MATRIX-REMODELING GENES



TECHNICAL FIELD 






The invention relates to novel matrix-remodeling genes identified by their coexpression with
known matrix-remodeling genes. The invention also relates to the use of these biomolecules in
diagnosis, prognosis, prevention, treatment, and evaluation of therapies for diseases,
particularly diseases associated with matrix remodeling such as cancer, cardiomyopathy,
arthritis, angiogenesis, diabetic necrosis, atherosclerosis, fibrosis, and ulceration.

BACKGROUND OF THE INVENTION 

Matrix remodeling is associated with the construction, destruction, and reorganization 
of extracellular matrix components and is essential in normal cellular functions and also in 
many disease processes. These disease processes include metastatic cancer, cardiomyopathy,
arthritis, angiogenesis, diabetic necrosis, atherosclerosis, fibrosis, and ulceration (Alexander
and Werb (1991) In: Cell Biology of Extracellular Matrix, Plenum Press, New York NY, pp.
255-302; Schuppan et al. (1993) In: Extracellular Matrix, Marcel Dekker, New York NY, pp.
201-254; Zvibel and Kraft (1993) In: Extracellular Matrix, Marcel Dekker, New York NY, pp.
559-580; Shanahan et al. (1994) J Clin Invest 93:2393-402; Kielty and Shuttleworth (1995) Int
J Biochem Cell Biol 27:747-60; Bitar and Labbad (1996) J Surg Res 61:113-9; Dourado et al.
(1996) Osteoarthritis Cartilage 4:187-96; Grant et al. (1996) Regul Pept 67:137-44;
Gunja-Smith et al. (1996) Am J Pathol 148:1639-48; Alcolado et al. (1997) Clin Sci 92:103-12;
Cs-Szabo et al. (1997) Arthritis Rheum 40:1037-45; Hayward and Brock (1997) Hum Mutat
10:415-23; Ledda et al. (1997) J Invest Dermatol 108:210-4; Hayashido et al. (1998) Int J
Cancer 75:654-8; Ito et al. (1998) Kidney Int 53:853-61; Nelson et al. (1998) Cancer Res
58:232-6).

Many genes that participate in and regulate matrix remodeling are known, but many 
remain to be identified. Identification of currently unknown genes will provide new 
diagnostic and therapeutic targets. In addition, these genes will provide new opportunities for 
therapeutic tissue engineering, that is, the use of drugs or biologicals to direct the creation of new
tissues such as skin, pancreas, or liver that can replace tissues lost to disease or trauma. 
The present invention provides new compositions that are useful for diagnosis,
prognosis, treatment, prevention, and evaluation of therapies for diseases associated with
matrix remodeling. We have implemented a method for analyzing gene expression patterns
and have identified 20 novel matrix-remodeling genes by their coexpression with known
matrix-remodeling genes.



In one aspect, the invention provides for a substantially purified polynucleotide
comprising a gene that is coexpressed with one or more known matrix-remodeling genes in a
plurality of biological samples. Preferably, each known matrix-remodeling gene is selected
from the group consisting of osteonectin (BM-40), chondroitin/dermatan sulfate proteoglycans
(C/DSPG), collagen I, II, III, and IV, connective tissue growth factor (CTGF), fibrillin,
fibronectins, fibronectin receptor (fibr-r), fibulin 1, heparan sulfate proteoglycans (HSPG),
extracellular matrix protein (hevin), insulin-like growth factor 1 (IGF 1), insulin-like growth
factor binding protein (IGFBP), laminin, lumican, matrix Gla protein (MGP), matrix
metalloproteases (MMP), and tissue inhibitors of matrix metalloproteinase 1, 2, and 3 (TIMP
1, 2, and 3). Preferred embodiments are (a) a polynucleotide sequence selected from the
group consisting of SEQ ID NOs:1-20; (b) a polynucleotide sequence which encodes a
polypeptide sequence of SEQ ID NOs:21, 22, and 23; (c) a polynucleotide sequence having
at least 70% identity to the polynucleotide sequence of (a) or (b); (d) a polynucleotide
sequence comprising at least 18 sequential nucleotides of the polynucleotide sequence of (a),
(b), or (c); (e) a polynucleotide which hybridizes under stringent conditions to the
polynucleotide of (a), (b), (c), or (d); or (f) a polynucleotide sequence which is
complementary to the polynucleotide sequence of (a), (b), (c), (d), or (e). Furthermore, the
invention provides an expression vector comprising any of the above-described
polynucleotides and host cells comprising the expression vector. Still further, the invention
provides a method for treating or preventing a disease or condition associated with the altered
expression of a gene that is coexpressed with one or more known matrix-remodeling genes in
a sample comprising administering to a subject in need the above-described polynucleotides in
an amount effective for treating or preventing said disease.

In a second aspect, the invention provides a substantially purified polypeptide
comprising the gene product of a gene that is coexpressed with one or more known
matrix-remodeling genes in a plurality of biological samples. The known matrix-remodeling gene
may be selected from the group consisting of osteonectin (BM-40), chondroitin/dermatan
sulfate proteoglycans (C/DSPG), collagen I, II, III, and IV, connective tissue growth factor
(CTGF), fibrillin, fibronectins, fibronectin receptor (fibr-r), fibulin 1, heparan sulfate
proteoglycans (HSPG), extracellular matrix protein (hevin), insulin-like growth factor 1 (IGF
1), insulin-like growth factor binding protein (IGFBP), laminin, lumican, matrix Gla protein
(MGP), matrix metalloproteases (MMP), and tissue inhibitors of matrix metalloproteinase 1,
2, and 3 (TIMP 1, 2, and 3). Preferred embodiments are polypeptides comprising (a) the
polypeptide sequence of SEQ ID NO:21, 22, or 23; (b) a polypeptide sequence having at least
85% identity to the polypeptide sequence of (a); and (c) a polypeptide sequence comprising at
least 6 sequential amino acids of the polypeptide sequence of (a) or (b). Additionally, the
invention provides antibodies that bind specifically to any of the above-described polypeptides
and a method for treating or preventing a disease or condition associated with the altered
expression of a gene that is coexpressed with one or more known matrix-remodeling genes in
a sample comprising administering to a subject in need such an antibody in an amount
effective for treating or preventing said disease.

In another aspect, the invention provides a pharmaceutical composition comprising
the polynucleotide of claim 2 or the polypeptide of claim 3 in conjunction with a suitable
pharmaceutical carrier, and a method for treating or preventing a disease or condition associated
with the altered expression of a gene that is coexpressed with one or more known
matrix-remodeling genes in a sample comprising administering to a subject in need such a
composition in an amount effective for treating or preventing said disease.

In yet a further aspect, the invention provides a method for diagnosing a disease or
condition associated with the altered expression of a gene that is coexpressed with one or more
known matrix-remodeling genes in a sample, wherein each known matrix-remodeling gene is
selected from the group consisting of osteonectin (BM-40), chondroitin/dermatan sulfate
proteoglycans (C/DSPG), collagen I, II, III, and IV, connective tissue growth factor (CTGF),
fibrillin, fibronectins, fibronectin receptor (fibr-r), fibulin 1, heparan sulfate proteoglycans
(HSPG), extracellular matrix protein (hevin), insulin-like growth factor 1 (IGF 1), insulin-like
growth factor binding protein (IGFBP), laminin, lumican, matrix Gla protein (MGP), matrix
metalloproteases (MMP), and tissue inhibitors of matrix metalloproteinase 1, 2, and 3 (TIMP
1, 2, and 3). The method comprises the steps of (a) providing the sample comprising one or
more of said coexpressed genes; (b) hybridizing the polynucleotide of the coexpressed genes
under conditions effective to form one or more hybridization complexes; and (c) detecting the
hybridization complexes, wherein the altered level of one or more of the hybridization
complexes in a diseased sample compared with the level of hybridization complexes in a
non-diseased sample correlates with the presence of the disease or condition in the sample.

WO 00/21986                                                                  PCT/US99/23315

BRIEF DESCRIPTION OF THE SEQUENCE LISTING 
The Sequence Listing provides exemplary matrix-remodeling-associated sequences
including polynucleotide sequences, SEQ ID NOs:1-20, and polypeptide sequences, SEQ ID
NOs:21-23. Each sequence is identified by a sequence identification number (SEQ ID NO)
and by the Incyte Clone number from which the sequence was first identified.

DESCRIPTION OF THE INVENTION 
It must be noted that as used herein and in the appended claims, the singular forms "a," 
"an," and "the" include the plural reference unless the context clearly dictates otherwise. 
Thus, for example, a reference to "a host cell" includes a plurality of such host cells, and a
reference to "an antibody" is a reference to one or more antibodies and equivalents thereof 
known to those skilled in the art, and so forth. 
DEFINITIONS 

"NSEQ" refers generally to a polynucleotide sequence of the present invention, 
including SEQ ID NOs:1-20. "PSEQ" refers generally to a polypeptide sequence of the
present invention, including SEQ ID NOs:21-23. 

A "variant" refers to either a polynucleotide or a polypeptide whose sequence diverges
from SEQ ID NOs:1-20 or SEQ ID NOs:21-23, respectively. Polynucleotide sequence
divergence may result from mutational changes such as deletions, additions, and substitutions
of one or more nucleotides; it may also occur because of differences in codon usage. Each of
these types of changes may occur alone, or in combination, one or more times in a given
sequence. Polypeptide variants include sequences that possess at least one structural or
functional characteristic of SEQ ID NOs:21-23.

A "fragment" can refer to a nucleic acid sequence that is preferably at least 20 nucleic
acids in length, more preferably 40 nucleic acids, and most preferably 60 nucleic acids in
length, and encompasses, for example, fragments consisting of nucleic acids 1-50 or 200-500
of SEQ ID NOs:1-20. A "fragment" can also refer to polypeptide sequences which are
preferably at least 5 to about 15 amino acids in length, most preferably at least 10 amino acids
long, and which retain some biological activity or immunological activity of, for example, a
sequence selected from SEQ ID NOs:21-23.

"Gene" or "gene sequence" refers to the partial or complete coding sequence of a
transcript. The term also refers to sequences corresponding to 5' or 3' untranslated regions or
5' or 3' untranslated regions including partial or complete coding sequences of a gene.
Typically, the novel gene sequences may or may not be homologous to annotated sequences
found in public or private databases. The gene may be in a sense or antisense
(complementary) orientation.
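
As a small illustration of the sense and antisense orientations mentioned above, the following Python sketch derives the complementary (antisense) strand of a DNA sequence. The function name and the example sequence are hypothetical and are not taken from the Sequence Listing.

```python
# Illustrative only: the sequence below is invented, not one of SEQ ID NOs:1-20.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}

def reverse_complement(seq: str) -> str:
    """Return the antisense strand of a DNA sequence, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

sense = "ATGGCCAAT"
antisense = reverse_complement(sense)
print(antisense)  # ATTGGCCAT
```

Applying the function twice returns the original sense strand, which is a convenient sanity check.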

"Known matrix-remodeling gene" refers to a gene sequence which has been previously 
identified as useful in the diagnosis, treatment, prognosis, or prevention of diseases associated 
with matrix remodeling. Typically, this means that the known matrix-remodeling gene is 
expressed at higher levels in tissue abundant in known matrix-remodeling transcripts when
compared with other tissue.

"Matrix-remodeling gene" refers to a gene sequence whose expression pattern is 
similar to that of the known matrix-remodeling genes and which is useful in the diagnosis,
treatment, prognosis, or prevention of diseases associated with matrix remodeling. The gene 
sequences can also be used in the evaluation of therapies for cancer. 

"Substantially purified" refers to a nucleic acid or an amino acid sequence that is
removed from its natural environment and is isolated or separated, and is at least about 60%
free, preferably about 75% free, and most preferably about 90% free from other components
with which it is naturally present.
THE INVENTION 

The present invention encompasses a method for identifying biomolecules that are
associated with a specific disease, regulatory pathway, subcellular compartment, cell type,
tissue type, or species. In particular, the method identifies gene sequences useful in diagnosis,
prognosis, treatment, prevention, and evaluation of therapies for diseases associated with
matrix remodeling, particularly cancer, cardiomyopathy, arthritis, angiogenesis, diabetic
necrosis, atherosclerosis, fibrosis, and ulceration.

The method first identifies polynucleotides that are expressed in a plurality
of cDNA libraries. The identified polynucleotides include genes of known function and genes
known to be specifically expressed in a specific disease process, subcellular compartment, cell
type, tissue type, or species. Additionally, the polynucleotides include genes of unknown
function. The expression patterns of the known genes are then compared with those of the
genes of unknown function to determine whether a specified coexpression probability
threshold is met. Through this comparison, a subset of the polynucleotides of unknown
function having a high coexpression probability with the known genes can be identified.
The high coexpression probability corresponds to a particular coexpression probability
threshold, which is less than 0.001, and more preferably less than 0.00001.

The polynucleotides originate from cDNA libraries derived from a variety of sources
including, but not limited to, eukaryotes such as human, mouse, rat, dog, monkey, plant, and
yeast and prokaryotes such as bacteria and viruses. These polynucleotides can also be selected
from a variety of sequence types including, but not limited to, expressed sequence tags
(ESTs), assembled polynucleotide sequences, full length gene coding regions, introns,
regulatory sequences, 5' untranslated regions, and 3' untranslated regions. To have statistically
significant analytical results, the polynucleotides need to be expressed in at least three cDNA
libraries.
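
The library-count requirement above can be sketched as a simple filter. This is a minimal illustration, assuming the expression data are available as a mapping from a gene identifier to the set of cDNA libraries in which the gene was detected; the gene and library names are hypothetical.

```python
MIN_LIBRARIES = 3  # minimum number of cDNA libraries stated in the text

def genes_with_enough_libraries(detections: dict[str, set[str]],
                                min_libraries: int = MIN_LIBRARIES) -> list[str]:
    """Keep genes detected in at least `min_libraries` distinct cDNA libraries."""
    return sorted(gene for gene, libs in detections.items()
                  if len(libs) >= min_libraries)

# Hypothetical detections: gene identifier -> libraries in which it was seen.
detections = {
    "geneA": {"lib1", "lib2", "lib3", "lib7"},
    "geneB": {"lib2", "lib5"},
    "geneC": {"lib1", "lib4", "lib9"},
}
kept = genes_with_enough_libraries(detections)
print(kept)  # ['geneA', 'geneC']
```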

The cDNA libraries used in the coexpression analysis of the present invention can be
obtained from blood vessels, heart, blood cells, cultured cells, connective tissue, epithelium,
islets of Langerhans, neurons, phagocytes, biliary tract, esophagus, gastrointestinal system,
liver, pancreas, fetus, placenta, chromaffin system, endocrine glands, ovary, uterus, penis,
prostate, seminal vesicles, testis, bone marrow, immune system, cartilage, muscles, skeleton,
central nervous system, ganglia, neuroglia, neurosecretory system, peripheral nervous system,
bronchus, larynx, lung, nose, pleura, ear, eye, mouth, pharynx, exocrine glands, bladder,
kidney, ureter, and the like. The number of cDNA libraries selected can range from as few as
20 to greater than 10,000. Preferably, the number of the cDNA libraries is greater than 500.

In a preferred embodiment, gene sequences are assembled to reflect related sequences,
such as assembled sequence fragments derived from a single transcript. Assembly of the
polynucleotide sequences can be performed using sequences of various types including, but
not limited to, ESTs, extensions, or shotgun sequences. In a most preferred embodiment, the
polynucleotide sequences are derived from human sequences that have been assembled using
the algorithm disclosed in "Database and System for Storing, Comparing and Displaying
Related Biomolecular Sequence Information", Lincoln et al., Serial No. 60/079,469, filed
March 26, 1998, incorporated herein by reference.

Experimentally, differential expression of the polynucleotides can be evaluated by 
methods including, but not limited to, differential display by spatial immobilization or by gel
electrophoresis, genome mismatch scanning, representational difference analysis, and 
transcript imaging. Additionally, differential expression can be assessed by microarray 
technology. These methods may be used alone or in combination. 

Known matrix-remodeling genes can be selected based on the use of the genes as
diagnostic or prognostic markers or as therapeutic targets for diseases associated with matrix
remodeling, such as cancer, cardiomyopathy, arthritis, angiogenesis, diabetic necrosis,
atherosclerosis, fibrosis, and ulceration. Preferably, the known matrix-remodeling genes
include osteonectin (BM-40), chondroitin/dermatan sulfate proteoglycans (C/DSPG),
collagen I, II, III, and IV, connective tissue growth factor (CTGF), fibrillin, fibronectins,
fibronectin receptor (fibr-r), fibulin 1, heparan sulfate proteoglycans (HSPG), extracellular
matrix protein (hevin), insulin-like growth factor 1 (IGF 1), insulin-like growth factor binding
protein (IGFBP), laminin, lumican, matrix Gla protein (MGP), matrix metalloproteases
(MMP), tissue inhibitors of matrix metalloproteinase 1, 2, and 3 (TIMP 1, 2, and 3), and the
like.

The procedure for identifying novel genes that exhibit a statistically significant
coexpression pattern with known matrix-remodeling genes is as follows. First, the presence or
absence of a gene sequence in a cDNA library is defined: a gene is present in a cDNA library
when at least one cDNA fragment corresponding to that gene is detected in a cDNA sample
taken from the library, and a gene is absent from a library when no corresponding cDNA
fragment is detected in the sample.
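
The presence/absence definition above amounts to binarizing fragment counts. The sketch below assumes a hypothetical list of cDNA fragment counts for one gene, one entry per library, and converts it into an occurrence vector of ones and zeros; the counts are invented for illustration.

```python
def occurrence_vector(fragment_counts: list[int]) -> list[int]:
    """Return 1 for each library with at least one detected cDNA fragment, else 0."""
    return [1 if count >= 1 else 0 for count in fragment_counts]

# Hypothetical fragment counts for one gene across five libraries.
counts_gene_a = [4, 1, 0, 0, 12]
vec = occurrence_vector(counts_gene_a)
print(vec)  # [1, 1, 0, 0, 1]
```

Note that the abundance information (4 versus 12 fragments) is deliberately discarded; only detection matters under this definition.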

Second, the significance of gene coexpression is evaluated using a probability method
to measure a due-to-chance probability of the coexpression. The probability method can be
the Fisher exact test, the chi-squared test, or the kappa test. These tests and examples of their
applications are well known in the art and can be found in standard statistics texts (Agresti, A
(1990) Categorical Data Analysis, John Wiley & Sons, New York NY; Rice, JA (1988)
Mathematical Statistics and Data Analysis, Duxbury Press, Pacific Grove CA). A Bonferroni
correction (Rice, supra, page 384) can also be applied in combination with one of the
probability methods for correcting statistical results of one gene versus multiple other genes.
In a preferred embodiment, the due-to-chance probability is measured by a Fisher exact test,
and the threshold of the due-to-chance probability is set to less than 0.001, more preferably
less than 0.00001.
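
For a 2 x 2 table of presence/absence counts, two of the alternatives named above can be sketched in a few lines of Python. This is an illustration only: the counts are hypothetical, the chi-squared statistic is computed without a continuity correction, and the "kappa test" is assumed here to mean Cohen's kappa coefficient, which the text does not specify.

```python
import math

def chi_squared_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-squared statistic (no continuity correction) and its
    p-value with one degree of freedom for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

def cohen_kappa_2x2(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = a + b + c + d
    observed = (a + d) / n
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (observed - expected) / (1.0 - expected)

# Hypothetical counts: both genes present, one present only (two cells), both absent.
stat, p = chi_squared_2x2(8, 2, 2, 18)
kappa = cohen_kappa_2x2(8, 2, 2, 18)
print(round(stat, 2), round(kappa, 2))  # 14.7 0.7
```

The chi-squared approximation is less reliable when expected cell counts are small, which is one reason an exact test may be preferred for sparse library data.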

To determine whether two genes, A and B, have similar coexpression patterns,
occurrence data vectors can be generated as illustrated in Table 1, wherein a gene's presence is
indicated by a one and its absence by a zero. A zero indicates that the gene did not occur in
the library, and a one indicates that it occurred at least once.



Table 1. Occurrence data for genes A and B

            Library 1    Library 2    Library 3    ...    Library N
gene A          1            1            0        ...        0
gene B          1            0            1        ...        0

For a given pair of genes, the occurrence data in Table 1 can be summarized in a 2 x 2
contingency table.

Table 2. Contingency table for co-occurrences of genes A and B

                    Gene A present    Gene A absent    Total
Gene B present             8                2            10
Gene B absent              2               18            20
Total                     10               20            30



Table 2 presents co-occurrence data for gene A and gene B in a total of 30 libraries. Both gene
A and gene B occur 10 times in the libraries. Table 2 summarizes and presents 1) the number
of times gene A and B are both present in a library, 2) the number of times gene A and B are
both absent in a library, 3) the number of times gene A is present while gene B is absent, and
4) the number of times gene B is present while gene A is absent. The upper left entry is the
number of times the two genes co-occur in a library, and the entry in the second row and
second column is the number of times neither gene occurs in a library. The off-diagonal
entries are the number of times one gene occurs while the other does not. Both A and B are
present eight times and absent 18 times, gene A is present while gene B is absent two times,
and gene B is present while gene A is absent two times. The probability ("p-value") that the
above association occurs due to chance as calculated using a Fisher exact test is 0.0003.
Associations are generally considered significant if a p-value is less than 0.01 (Agresti,
supra; Rice, supra).
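
The calculation described above can be reproduced with a short, dependency-free Python sketch. It assumes a one-sided Fisher exact test, that is, the probability of observing at least as many co-occurrences as were seen given the marginal totals; the text does not state which tail convention was used. The function names and the layout of the occurrence vectors are illustrative.

```python
from math import comb

def contingency_counts(vec_a: list[int], vec_b: list[int]) -> tuple[int, int, int, int]:
    """Return (both present, B only, A only, both absent) over paired libraries."""
    pairs = list(zip(vec_a, vec_b))
    both = sum(1 for a, b in pairs if a and b)
    b_only = sum(1 for a, b in pairs if not a and b)
    a_only = sum(1 for a, b in pairs if a and not b)
    neither = sum(1 for a, b in pairs if not a and not b)
    return both, b_only, a_only, neither

def fisher_exact_greater(both: int, b_only: int, a_only: int, neither: int) -> float:
    """One-sided Fisher exact p-value: chance of at least `both` co-occurrences
    when the row and column totals are held fixed (hypergeometric tail)."""
    n = both + b_only + a_only + neither
    a_total = both + a_only  # libraries containing gene A
    b_total = both + b_only  # libraries containing gene B
    tail = sum(comb(b_total, k) * comb(n - b_total, a_total - k)
               for k in range(both, min(a_total, b_total) + 1))
    return tail / comb(n, a_total)

# Occurrence vectors arranged to reproduce the counts of Table 2 (30 libraries).
vec_a = [1] * 8 + [1] * 2 + [0] * 2 + [0] * 18
vec_b = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 18
counts = contingency_counts(vec_a, vec_b)
p_value = fisher_exact_greater(*counts)
print(counts, f"{p_value:.4f}")  # (8, 2, 2, 18) 0.0003
```

With the counts of Table 2 the sketch gives a p-value of about 0.0003, in agreement with the value quoted above.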

This method of estimating the probability for coexpression of two genes makes several
assumptions. The method assumes that the libraries are independent and are identically
sampled. However, in practical situations, the selected cDNA libraries are not entirely
independent because more than one library may be obtained from a single patient or tissue,
and they are not entirely identically sampled because different numbers of cDNAs may be
sequenced from each library (typically ranging from 5,000 to 10,000 cDNAs per library). In
addition, because a Fisher exact coexpression probability is calculated for each gene versus
41,419 other genes, a Bonferroni correction for multiple statistical tests is necessary.
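
A Bonferroni correction of the kind mentioned above can be sketched as follows. The number of tests is taken from the text; the raw p-value is hypothetical, and whether the thresholds stated elsewhere in this description apply before or after correction is not specified, so the comparison at the end is only an example.

```python
N_TESTS = 41_419  # number of other genes each gene is compared with (from the text)

def bonferroni_adjust(p_value: float, n_tests: int = N_TESTS) -> float:
    """Bonferroni-adjusted p-value: raw p-value times the number of tests, capped at 1."""
    return min(1.0, p_value * n_tests)

def bonferroni_threshold(alpha: float, n_tests: int = N_TESTS) -> float:
    """Per-test threshold that keeps the family-wise error rate at `alpha`."""
    return alpha / n_tests

raw_p = 1e-9  # hypothetical raw Fisher exact p-value for one gene pair
adjusted = bonferroni_adjust(raw_p)
print(adjusted < 0.001)  # True
```

The two helpers are equivalent ways of applying the same correction: either the p-value is scaled up or the threshold is scaled down.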

Using the method of the present invention, we have identified 20 novel genes that
exhibit strong association, or coexpression, with known genes that are
matrix-remodeling-specific. These known matrix-remodeling genes include osteonectin (BM-40),
chondroitin/dermatan sulfate proteoglycans (C/DSPG), collagen I, II, III, and IV, connective
tissue growth factor (CTGF), fibrillin, fibronectins, fibronectin receptor (fibr-r), fibulin 1,
heparan sulfate proteoglycans (HSPG), extracellular matrix protein (hevin), insulin-like
growth factor 1 (IGF 1), insulin-like growth factor binding protein (IGFBP), laminin, lumican,
matrix Gla protein (MGP), matrix metalloproteases (MMP), and tissue inhibitors of matrix
metalloproteinase 1, 2, and 3 (TIMP 1, 2, and 3). The results presented in Tables 5 and 6
show that the expression of the 20 novel genes has direct or indirect association with the
expression of known matrix-remodeling genes. Therefore, the novel genes can potentially be
used in diagnosis, treatment, prognosis, or prevention of diseases associated with matrix
remodeling, or in the evaluation of therapies for diseases associated with matrix remodeling.
Further, the gene products of the 20 novel genes are potential therapeutic proteins and targets
of therapeutics against diseases associated with matrix remodeling.

Therefore, in one embodiment, the present invention encompasses a polynucleotide
sequence comprising the sequence of SEQ ID NOs:1-20. These 20 polynucleotides are shown
by the method of the present invention to have strong coexpression association with known
matrix-remodeling genes and with each other. The invention also encompasses a variant of
the polynucleotide sequence, its complement, or 18 consecutive nucleotides of a sequence
provided in the above-described sequences. Variant polynucleotide sequences typically have
at least about 70%, more preferably at least about 85%, and most preferably at least about 95%
polynucleotide sequence identity to NSEQ.
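
To make the identity thresholds above concrete, the following sketch computes percent identity between two sequences that have already been aligned, with gaps written as "-". The definition used here (matches divided by aligned columns) is an assumption, since the text does not fix a formula, and the sequences are invented rather than drawn from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns; columns that are gaps in both are ignored."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)

def variant_tier(identity: float) -> str:
    """Map a percent identity onto the tiers named in the text."""
    if identity >= 95.0:
        return "most preferred (at least about 95%)"
    if identity >= 85.0:
        return "more preferred (at least about 85%)"
    if identity >= 70.0:
        return "typical variant (at least about 70%)"
    return "below the stated variant range"

identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")  # one mismatch in ten columns
print(identity, variant_tier(identity))  # 90.0 more preferred (at least about 85%)
```

In practice the alignment itself would come from a tool such as BLAST; this sketch covers only the final counting step.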

One preferred method for identifying variants entails using NSEQ and/or PSEQ
sequences to search against the GenBank primate (pri), rodent (rod), mammalian (mam),
vertebrate (vrtp), and eukaryote (eukp) databases, SwissProt, BLOCKS (Bairoch et al. (1997)
Nucleic Acids Res 25:217-221), PFAM, and other databases that contain previously identified
and annotated motifs, sequences, and gene functions. Methods that search for primary
sequence patterns with secondary structure gap penalties (Smith et al. (1992) Protein
Engineering 5:35-51) as well as algorithms such as BLAST (Basic Local Alignment Search
Tool; Altschul (1993) J Mol Evol 36:290-300; and Altschul et al. (1990) J Mol Biol
215:403-410), BLOCKS (Henikoff and Henikoff (1991) Nucleic Acids Res 19:6565-6572), Hidden
Markov Models (HMM; Eddy (1996) Cur Opin Str Biol 6:361-365; Sonnhammer et al. (1997)
Proteins 28:405-420), and the like, can be used to manipulate and analyze nucleotide and
amino acid sequences. These databases, algorithms, and other methods are well known in the
art and are described in Ausubel et al. (1997; Short Protocols in Molecular Biology, John
Wiley & Sons, New York NY) and in Meyers (1995; Molecular Biology and Biotechnology,
Wiley VCH, New York NY, pp. 856-853).

Also encompassed by the invention are polynucleotide sequences that are capable of
hybridizing to SEQ ID NOs:1-20, and fragments thereof, under stringent conditions. Stringent
conditions can be defined by salt concentration, temperature, and other chemicals and
conditions well known in the art. In particular, stringency can be increased by reducing the
concentration of salt, or raising the hybridization temperature. Varying additional parameters,
such as hybridization time, the concentration of detergent or solvent, and the inclusion or
exclusion of carrier DNA, are well known to those skilled in the art. Additional variations on
these conditions will be readily apparent to those skilled in the art (Wahl and Berger (1987)
Methods Enzymol 152:399-407; Kimmel (1987) Methods Enzymol 152:507-511; Ausubel,
supra; and Sambrook et al. (1989) Molecular Cloning, A Laboratory Manual, Cold Spring
Harbor Press, Plainview NY).

NSEQ or the polynucleotide sequences encoding PSEQ can be extended utilizing a
partial nucleotide sequence and employing various PCR-based methods known in the art to
detect upstream sequences, such as promoters and regulatory elements. (See, e.g.,
Dieffenbach and Dveksler (1995) PCR Primer, a Laboratory Manual, Cold Spring Harbor
Press, Plainview NY; Sarkar (1993) PCR Methods Applic 2:318-322; Triglia et al. (1988)
Nucleic Acids Res 16:8186; Lagerstrom et al. (1991) PCR Methods Applic 1:111-119; and
Parker et al. (1991) Nucleic Acids Res 19:3055-3060). Additionally, one may use PCR, nested
primers, and PROMOTERFINDER libraries (Clontech, Palo Alto CA) to walk genomic
DNA. This procedure avoids the need to screen libraries and is useful in finding intron/exon
junctions. For all PCR-based methods, primers may be designed using commercially
available software, such as OLIGO 4.06 Primer Analysis software (National Biosciences,
Plymouth MN) or another appropriate program, to be about 18 to 30 nucleotides in length, to
have a GC content of about 50% or more, and to anneal to the template at temperatures of
about 68°C to 72°C.
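
The length and GC-content guidelines above lend themselves to a quick automated check. The sketch below is a hypothetical helper, not a substitute for the primer-design software named in the text: it checks only length and GC fraction and does not estimate annealing temperature, and the candidate primer is invented.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)

def meets_primer_guidelines(primer: str, min_len: int = 18, max_len: int = 30,
                            min_gc: float = 0.50) -> bool:
    """True if the primer is 18 to 30 nucleotides long with at least 50% GC content."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc

candidate = "GCTAGCCGATCGGATCCGTA"  # hypothetical 20-mer
print(len(candidate), gc_fraction(candidate), meets_primer_guidelines(candidate))
# 20 0.6 True
```

A primer passing this check would still need its annealing behavior confirmed against the 68°C to 72°C range by a dedicated tool or by experiment.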

In another aspect of the invention, NSEQ or the polynucleotide sequences encoding
PSEQ can be cloned in recombinant DNA molecules that direct expression of PSEQ or the
polypeptides encoded by NSEQ, or structural or functional fragments thereof, in appropriate
host cells. Due to the inherent degeneracy of the genetic code, other DNA sequences which
encode substantially the same or a functionally equivalent amino acid sequence may be
produced and used to express the polypeptides of PSEQ or the polypeptides encoded by
NSEQ. The nucleotide sequences of the present invention can be engineered using methods
generally known in the art in order to alter the nucleotide sequences for a variety of purposes
including, but not limited to, modification of the cloning, processing, and/or expression of the
gene product. DNA shuffling by random fragmentation and PCR reassembly of gene
fragments and synthetic oligonucleotides may be used to engineer the nucleotide sequences.
For example, oligonucleotide-mediated site-directed mutagenesis may be used to introduce
mutations that create new restriction sites, alter glycosylation patterns, change codon
preference, produce splice variants, and so forth.

In order to express a biologically active polypeptide encoded by NSEQ, NSEQ or the
polynucleotide sequences encoding PSEQ, or derivatives thereof, may be inserted into an
appropriate expression vector, i.e., a vector which contains the necessary elements for
transcriptional and translational control of the inserted coding sequence in a suitable host.
These elements include regulatory sequences, such as enhancers, constitutive and inducible
promoters, and 5' and 3' untranslated regions in the vector and in NSEQ or polynucleotide
sequences encoding PSEQ. Methods which are well known to those skilled in the art may be
used to construct expression vectors containing NSEQ or polynucleotide sequences encoding
PSEQ and appropriate transcriptional and translational control elements. These methods
include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic
recombination. (See, e.g., Sambrook, supra, and Ausubel, supra.)

A variety of expression vector/host cell systems may be utilized to contain and express
NSEQ or polynucleotide sequences encoding PSEQ. These include, but are not limited to,
microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, or
cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell
systems infected with viral expression vectors (baculovirus); plant cell systems transformed
with viral expression vectors (cauliflower mosaic virus (CaMV) or tobacco mosaic virus
(TMV)) or with bacterial expression vectors (Ti or pBR322 plasmids); or animal cell systems.
The invention is not limited by the host cell employed. For long term production of
recombinant proteins in mammalian systems, stable expression of a polypeptide encoded by
NSEQ in cell lines is preferred. For example, NSEQ or sequences encoding PSEQ can be
transformed into cell lines using expression vectors which may contain viral origins of
replication and/or endogenous expression elements and a selectable marker gene on the same
or on a separate vector.

In general, host cells that contain NSEQ and that express PSEQ may be identified by a
variety of procedures known to those of skill in the art. These procedures include, but are not
limited to, DNA-DNA or DNA-RNA hybridizations, PCR amplification, and protein bioassay
or immunoassay techniques which include membrane, solution, or chip based technologies for
the detection and/or quantification of nucleic acid or protein sequences. Immunological
methods for detecting and measuring the expression of PSEQ using either specific polyclonal
or monoclonal antibodies are known in the art. Examples of such techniques include
enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and
fluorescence activated cell sorting (FACS).



30 be cultured under conditions suitable for the expression and recovery of the protein from cell 
culture. The protein produced by a transformed cell may be secreted or retained intracellularly 



Host cells transformed with NSEQ or polynucleotide sequences encoding PSEQ may 
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depending on the sequence and/or the vector used. As will be understood by those of skill in 
the art, expression vectors containing polynucleotides of NSEQ or polynucleotides encoding 
PSEQ may be designed to contain signal sequences which direct secretion of PSEQ or 
polypeptides encoded by NSEQ through a prokaryotic or eukaryotic cell membrane, 
In addition, a host cell strain may be chosen for its ability to modulate expression of
the inserted sequences or to process the expressed protein in the desired fashion. Such
modifications of the polypeptide include, but are not limited to, acetylation, carboxylation,
glycosylation, phosphorylation, lipidation, and acylation. Post-translational processing which
cleaves a "prepro" form of the protein may also be used to specify protein targeting, folding,
and/or activity. Different host cells which have specific cellular machinery and characteristic
mechanisms for post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38)
are available from the American Type Culture Collection (ATCC, Manassas VA) and may be
chosen to ensure the correct modification and processing of the foreign protein.

In another embodiment of the invention, natural, modified, or recombinant NSEQ or
nucleic acid sequences encoding PSEQ are ligated to a heterologous sequence resulting in
translation of a fusion protein containing heterologous protein moieties in any of the
aforementioned host systems. Such heterologous protein moieties facilitate purification of
fusion proteins using commercially available affinity matrices. Such moieties include, but are
not limited to, glutathione S-transferase (GST), maltose binding protein (MBP), thioredoxin
(Trx), calmodulin binding peptide (CBP), 6-His, FLAG, c-myc, hemagglutinin (HA), and
monoclonal antibody epitopes.

In another embodiment, NSEQ or sequences encoding PSEQ are synthesized, in whole
or in part, using chemical methods well known in the art. (See, e.g., Caruthers et al. (1980)
Nucleic Acids Symp Ser (7) 215-223; Horn et al. (1980) Nucleic Acids Symp Ser (7) 225-232;
and Ausubel, supra.) Alternatively, PSEQ or a polypeptide sequence encoded by NSEQ itself,
or a fragment thereof, may be synthesized using chemical methods. For example, peptide
synthesis can be performed using various solid-phase techniques (Roberge et al. (1995)
Science 269:202-204). Automated synthesis may be achieved using the ABI 431A peptide
synthesizer (PE Biosystems, Foster City CA). Additionally, PSEQ or the amino acid
sequence encoded by NSEQ, or any part thereof, may be altered during direct synthesis and/or
combined with sequences from other proteins, or any part thereof, to produce a polypeptide
variant.



In another embodiment, the invention provides a substantially purified polypeptide 
comprising the amino acid sequence selected from the group consisting of SEQ ID NO:21, 
SEQ ID NO:22, SEQ ID NO:23 or fragments thereof. 

DIAGNOSTICS and THERAPEUTICS

The sequences of these genes can be used in diagnosis, prognosis, treatment,
prevention, and evaluation of therapies for diseases associated with matrix-remodeling,
particularly cancer, cardiomyopathy, arthritis, angiogenesis, diabetic necrosis, atherosclerosis,
fibrosis, and ulceration. Further, the amino acid sequences encoded by the novel genes are
potential therapeutic proteins and targets of anti-cancer therapeutics or for the treatment of
other diseases associated with matrix remodeling.

In one preferred embodiment, the polynucleotide sequences of NSEQ or the
polynucleotides encoding PSEQ are used for diagnostic purposes to investigate the altered
expression of PSEQ, and to monitor regulation of the levels of mRNA or the polypeptides
encoded by NSEQ during therapeutic intervention. The polynucleotides may be at least 18
nucleotides long, and may be complementary RNA or DNA molecules, branched nucleic
acids, or peptide nucleic acids (PNAs). Alternatively, the polynucleotides are used to detect
and quantitate gene expression in samples in which expression of PSEQ or the polypeptides
encoded by NSEQ is correlated with disease. Additionally, NSEQ or the polynucleotides
encoding PSEQ can be used to detect genetic polymorphisms associated with a disease. These
polymorphisms may be detected at the transcript, cDNA, or genomic level.

The specificity of the probe, whether it is made from a highly specific region, e.g., the
5' regulatory region, or from a less specific region, e.g., a conserved motif, and the stringency
of the hybridization or amplification (maximal, high, intermediate, or low), will determine
whether the probe identifies only naturally occurring sequences encoding PSEQ, allelic
variants, or related sequences.

Probes may also be used for the detection of related sequences, and should preferably 
have at least 70% sequence identity to any of the NSEQ or PSEQ-encoding sequences. 
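
By way of illustration only, the length and identity criteria for probes can be sketched as follows. The sketch assumes two sequences that have already been aligned to equal length, with gaps written as '-', and it counts gapped columns as mismatches; other definitions of percent identity exist. The function names and default parameters are hypothetical and are not part of the described method.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same base.

    Assumes pre-aligned, equal-length strings with '-' marking gaps.
    Columns that contain a gap are counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def is_candidate_probe(aligned_probe: str, aligned_target: str,
                       min_identity: float = 70.0,
                       min_length: int = 18) -> bool:
    """Apply the 18-nucleotide length and 70% identity thresholds from the text."""
    probe_length = len(aligned_probe.replace("-", ""))  # ungapped probe length
    identity = percent_identity(aligned_probe, aligned_target)
    return probe_length >= min_length and identity >= min_identity
```

A 20-nucleotide probe that differs from its target at four positions has 80% identity and passes both thresholds, whereas a 4-nucleotide probe fails on length regardless of identity.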



Means for producing specific hybridization probes for DNAs encoding PSEQ include
the cloning of NSEQ or polynucleotide sequences encoding PSEQ into vectors for the
production of mRNA probes. Such vectors are known in the art, are commercially available,
and may be used to synthesize RNA probes in vitro by means of the addition of the
appropriate RNA polymerases and the appropriate labeled nucleotides. Hybridization probes
may be labeled by a variety of reporter groups, for example, by radionuclides such as 32P or
35S, or by enzymatic labels, such as alkaline phosphatase coupled to the probe via avidin/biotin
coupling systems, by fluorescent labels and the like. The polynucleotide sequences encoding
PSEQ may be used in Southern or northern analysis, dot blot, or other membrane-based
technologies; in PCR technologies; and in microarrays utilizing fluids or tissues from patients
to detect altered NSEQ expression. Such qualitative or quantitative methods are well known
in the art.

NSEQ or the nucleotide sequences encoding PSEQ can be labeled by standard 
methods and added to a fluid or tissue sample from a patient under conditions suitable for the 
formation of hybridization complexes. After a suitable incubation period, the sample is 
washed and the signal is quantitated and compared with a standard value, typically, derived 
from a non-diseased sample. If the amount of signal in the patient sample is significantly 
altered in comparison to the standard value then the presence of altered levels of nucleotide 
sequences of NSEQ and those encoding PSEQ in the sample indicates the presence of the 
associated disease. Such assays may also be used to evaluate the efficacy of a particular 
therapeutic treatment regimen in animal studies, in clinical trials, or to monitor the treatment 
of an individual patient. 
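
As a purely illustrative sketch of the comparison described above, the following code derives a standard value from signals measured in non-diseased samples and flags a patient signal that deviates from it. The text does not define "significantly altered"; the z-score test and the cutoff of 2.0 are assumptions made only for this example, and the function name is hypothetical.

```python
from statistics import mean, stdev


def is_significantly_altered(patient_signal: float,
                             reference_signals: list,
                             z_cutoff: float = 2.0) -> bool:
    """Compare a patient signal with a standard value from non-diseased samples.

    The standard value is the mean of the reference signals. The z-score
    criterion and its cutoff are illustrative assumptions, not the method
    specified in the text.
    """
    if len(reference_signals) < 2:
        raise ValueError("need at least two reference samples")
    standard_value = mean(reference_signals)
    spread = stdev(reference_signals)  # sample standard deviation
    if spread == 0:
        # No variation among references: any difference counts as altered.
        return patient_signal != standard_value
    z_score = (patient_signal - standard_value) / spread
    return abs(z_score) > z_cutoff
```

The same comparison can be repeated on successive samples from one patient to follow whether the signal moves back toward the standard value during treatment.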

Once the presence of a disease is established and a treatment protocol is initiated, 
hybridization or amplification assays can be repeated on a regular basis to determine if the 
level of expression in the patient begins to approximate that which is observed in a healthy 
subject. The results obtained from successive assays may be used to show the efficacy of 
treatment over a period ranging from several days to months. 

The polynucleotides may be used for the diagnosis of a variety of diseases associated 
with matrix-remodeling including cancer such as adenocarcinoma, leukemia, lymphoma, 
melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal 
gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal 
tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary 
glands, skin, spleen, testis, thymus, thyroid, and uterus, cardiomyopathy, arthritis, 
angiogenesis, diabetic necrosis, atherosclerosis, fibrosis, and ulceration.

Alternatively, the polynucleotides may be used as targets in a microarray. The
microarray can be used to monitor the expression level of large numbers of genes
simultaneously and to identify splice variants, mutations, and polymorphisms. This
information may be used to determine gene function, to understand the genetic basis of a
disease, to diagnose a disease, and to develop and monitor the activities of therapeutic agents.

In yet another alternative, polynucleotides may be used to generate hybridization
probes useful in mapping the naturally occurring genomic sequence. Fluorescent in situ
hybridization (FISH) may be correlated with other physical chromosome mapping techniques
and genetic map data. (See, e.g., Heinz-Ulrich et al. (1995) in Meyers, supra, pp. 965-968.)

In another embodiment, antibodies which specifically bind PSEQ may be used for the
diagnosis of diseases characterized by the over- or underexpression of PSEQ or polypeptides
encoded by NSEQ. A variety of protocols for measuring PSEQ or the polypeptides encoded
by NSEQ, including ELISAs, RIAs, and FACS, are well known in the art and provide a basis
for diagnosing altered or abnormal levels of the expression of PSEQ or the polypeptides
encoded by NSEQ. Standard values for PSEQ expression are established by combining body
fluids or cell extracts taken from healthy subjects, preferably human, with antibody to PSEQ
or a polypeptide encoded by NSEQ under conditions suitable for complex formation. The
amount of complex formation may be quantitated by various methods, preferably by
photometric means. Quantities of PSEQ or the polypeptides encoded by NSEQ expressed in
disease samples from, for example, biopsied tissues are compared with the standard values.
Deviation between standard and subject values establishes the parameters for diagnosing or
monitoring disease. Alternatively, one may use competitive drug screening assays in which
neutralizing antibodies capable of binding PSEQ or the polypeptides encoded by NSEQ
specifically compete with a test compound for binding the polypeptides. Antibodies can be
used to detect the presence of any peptide which shares one or more antigenic determinants
with PSEQ or the polypeptides encoded by NSEQ.

In another aspect, the polynucleotides and polypeptides of the present invention can be
employed for treatment or the monitoring of therapeutic treatments for cancer. The
polynucleotides of NSEQ or those encoding PSEQ, or any fragment or complement thereof,
may be used for therapeutic purposes. In one aspect, the complement of the polynucleotides
of NSEQ or those encoding PSEQ may be used in situations in which it would be desirable to
block the transcription or translation of the mRNA.

Expression vectors derived from retroviruses, adenoviruses, or herpes or vaccinia
viruses, or from various bacterial plasmids, may be used for delivery of nucleotide sequences
to the targeted organ, tissue, or cell population. Methods which are well known to those
skilled in the art can be used to construct vectors to express nucleic acid sequences
complementary to the polynucleotides encoding PSEQ. (See, e.g., Sambrook, supra; and
Ausubel, supra.)

Genes having polynucleotide sequences of NSEQ or those encoding PSEQ can be
turned off by transforming a cell or tissue with expression vectors which express high levels of
a polynucleotide, or fragment thereof, encoding PSEQ. Such constructs may be used to
introduce untranslatable sense or antisense sequences into a cell. Oligonucleotides derived
from the transcription initiation site, e.g., between about positions -10 and +10 from the start
site, are preferred. Similarly, inhibition can be achieved using triple helix base-pairing
methodology. Triple helix pairing is useful because it causes inhibition of the ability of the
double helix to open sufficiently for the binding of polymerases, transcription factors, or
regulatory molecules. Recent therapeutic advances using triplex DNA have been described in
the literature. (See, e.g., Gee et al. (1994) In: Huber and Carr, Molecular and Immunologic
Approaches, Futura Publishing, Mt. Kisco NY, pp. 163-177.) Ribozymes, enzymatic RNA
molecules, may also be used to catalyze the specific cleavage of RNA.
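
The preference for oligonucleotides spanning about positions -10 to +10 of the start site can be illustrated with a short sketch that extracts that window from a sense-strand sequence and returns its antisense (reverse complement) sequence. The sketch assumes the numbering convention in which there is no position 0, so that the window covers 20 nucleotides, and it assumes the caller supplies the 0-based index of the +1 base. All names are hypothetical.

```python
# Translation table for DNA base complements (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_window(sense_strand: str, start_index: int,
                     upstream: int = 10, downstream: int = 10) -> str:
    """Antisense oligonucleotide covering positions -upstream to +downstream.

    start_index is the 0-based index of the +1 base on the sense strand.
    With no position 0, the window runs from -10..-1 and +1..+10 by default.
    """
    begin = start_index - upstream
    end = start_index + downstream
    if begin < 0 or end > len(sense_strand):
        raise ValueError("window extends beyond the supplied sequence")
    return reverse_complement(sense_strand[begin:end])
```

For a sense strand consisting of ten A residues followed by ten C residues, with the +1 base at index 10, the function returns ten G residues followed by ten T residues.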

RNA molecules may be modified to increase intracellular stability and half-life.
Possible modifications include, but are not limited to, the addition of flanking sequences at the
5' and/or 3' ends of the molecule, or the use of phosphorothioate or 2' O-methyl rather than
phosphodiester linkages within the backbone of the molecule. Alternatively, nontraditional
bases such as inosine, queuosine, and wybutosine, as well as acetyl-, methyl-, thio-, and
similarly modified forms of adenine, cytidine, guanine, thymine, and uridine which are not as
easily recognized by endogenous endonucleases may be included.

Many methods for introducing vectors into cells or tissues are available and equally
suitable for use in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors may be introduced
into stem cells taken from the patient and clonally propagated for autologous transplant back
into that same patient. Delivery by transfection, by liposome injections, or by polycationic
amino polymers may be achieved using methods which are well known in the art. (See, e.g.,
Goldman et al. (1997) Nature Biotechnology 15:462-466.)

Further, an antagonist or antibody of a polypeptide of PSEQ or encoded by NSEQ may
be administered to a subject to treat or prevent a cancer associated with increased expression
or activity of PSEQ. An antibody which specifically binds the polypeptide may be used
directly as an antagonist or indirectly as a targeting or delivery mechanism for bringing a
pharmaceutical agent to cells or tissue which express the polypeptide.

Antibodies to PSEQ or polypeptides encoded by NSEQ may also be generated using
methods that are well known in the art. Such antibodies may include, but are not limited to,
polyclonal, monoclonal, chimeric, and single chain antibodies, Fab fragments, and fragments
produced by a Fab expression library. Neutralizing antibodies (i.e., those which inhibit dimer
formation) are especially preferred for therapeutic use. Monoclonal antibodies to PSEQ may
be prepared using any technique which provides for the production of antibody molecules by
continuous cell lines in culture. These include, but are not limited to, the hybridoma
technique, the human B-cell hybridoma technique, and the EBV-hybridoma technique. In
addition, techniques developed for the production of chimeric antibodies can be used. (See,
for example, Meyers, supra.) Alternatively, techniques described for the production of single
chain antibodies may be employed. Antibody fragments which contain specific binding sites
for PSEQ or the polypeptide sequences encoded by NSEQ may also be generated.

Various immunoassays may be used for screening to identify antibodies having the
desired specificity. Numerous protocols for competitive binding or immunoradiometric
assays using either polyclonal or monoclonal antibodies with established specificities are well
known in the art.

Yet further, an agonist of a polypeptide of PSEQ or that encoded by NSEQ may be
administered to a subject to treat or prevent a cancer associated with decreased expression or
activity of the polypeptide.

An additional aspect of the invention relates to the administration of a pharmaceutical
or sterile composition, in conjunction with a pharmaceutically acceptable carrier, for any of
the therapeutic effects discussed above. Such pharmaceutical compositions may consist of
polypeptides of PSEQ or those encoded by NSEQ, antibodies to the polypeptides, and
mimetics, agonists, antagonists, or inhibitors of the polypeptides. The compositions may be
administered alone or in combination with at least one other agent, such as a stabilizing
compound, which may be administered in any sterile, biocompatible pharmaceutical carrier
including, but not limited to, saline, buffered saline, dextrose, and water. The compositions
may be administered to a patient alone, or in combination with other agents, drugs, or
hormones.

The pharmaceutical compositions utilized in this invention may be administered by
any number of routes including, but not limited to, oral, intravenous, intramuscular,
intra-arterial, intramedullary, intrathecal, intraventricular, transdermal, subcutaneous,
intraperitoneal, intranasal, enteral, topical, sublingual, or rectal means.

In addition to the active ingredients, these pharmaceutical compositions may contain
suitable pharmaceutically-acceptable carriers comprising excipients and auxiliaries which
facilitate processing of the active compounds into preparations which can be used
pharmaceutically. Further details on techniques for formulation and administration may be
found in the latest edition of Remington's Pharmaceutical Sciences (Mack Publishing, Easton
PA).

For any compound, the therapeutically effective dose can be estimated initially either
in cell culture assays, e.g., of neoplastic cells, or in animal models such as mice, rats, rabbits,
dogs, or pigs. An animal model may also be used to determine the appropriate concentration
range and route of administration. Such information can then be used to determine useful
doses and routes for administration in humans.

A therapeutically effective dose refers to that amount of active ingredient, for example,
polypeptides of PSEQ or those encoded by NSEQ, or fragments thereof, antibodies to the
polypeptides, and agonists, antagonists or inhibitors of the polypeptides, which ameliorates the
symptoms or condition. Therapeutic efficacy and toxicity may be determined by standard
pharmaceutical procedures in cell cultures or with experimental animals, such as by
calculating the ED50 (the dose therapeutically effective in 50% of the population) or LD50 (the
dose lethal to 50% of the population) statistics.
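
One simple way to estimate such a statistic from dose-response data is sketched below, for illustration only. It assumes doses listed in ascending order with the fraction of the population responding at each dose, and it interpolates linearly on the logarithm of dose between the two points that bracket a 50% response. The text does not prescribe an estimation procedure; probit or logistic fitting are common alternatives. The function name and the data in the usage note are hypothetical.

```python
import math


def dose_at_half_response(doses, fraction_responding):
    """Estimate the dose at which 50% of the population responds.

    doses: positive doses in ascending order.
    fraction_responding: fraction (0..1) responding at each dose,
        assumed non-decreasing with dose.
    Interpolates linearly on log10(dose) between the bracketing points.
    """
    points = list(zip(doses, fraction_responding))
    for (d_lo, f_lo), (d_hi, f_hi) in zip(points, points[1:]):
        if f_lo <= 0.5 <= f_hi and f_hi > f_lo:
            t = (0.5 - f_lo) / (f_hi - f_lo)  # position within the interval
            log_dose = math.log10(d_lo) + t * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_dose
    raise ValueError("response data do not bracket 50%")
```

Supplying the fraction showing a therapeutic effect yields an ED50 estimate, and supplying the fraction dying yields an LD50 estimate. With hypothetical doses of 1 and 100 units producing responses of 0% and 100%, the interpolated value is 10 units.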

Any of the therapeutic methods described above may be applied to any subject in need 
of such therapy, including, for example, mammals such as dogs, cats, cows, horses, rabbits, 
monkeys, and most preferably, humans. 

EXAMPLES

It is understood that this invention is not limited to the particular methodology,
protocols, and reagents described, as these may vary. It is also understood that the
terminology used herein is for the purpose of describing particular embodiments only, and is
not intended to limit the scope of the present invention which will be limited only by the
appended claims. The examples below are provided to illustrate the subject invention and are
not included for the purpose of limiting the invention.

I cDNA Library Construction

The cDNA library, THYMFET02, was selected to demonstrate the construction of the
cDNA libraries from which novel matrix remodeling genes were derived. The THYMFET02
cDNA library was constructed from microscopically normal thymus tissue obtained from a
Caucasian female fetus who died at 17 weeks gestation from anencephaly. Serology was
negative; family history included tobacco abuse and gastritis.

The frozen tissue was homogenized and lysed in TRIZOL reagent (1 gm tissue/10 ml
TRIZOL; Life Technologies, Rockville MD), a monophasic solution of phenol and guanidine
isothiocyanate, using a POLYTRON homogenizer (PT-3000; Brinkmann Instruments,
Westbury NY). After a brief incubation on ice, chloroform was added (1:5 v/v), and the lysate
was centrifuged. The upper chloroform layer was removed, and the RNA was precipitated
with isopropanol, resuspended in DEPC-treated water, and treated with DNase for 25 min at
37°C. The mRNA was reextracted once with acid phenol-chloroform pH 4.7 and precipitated
using 0.3 M sodium acetate and 2.5 volumes ethanol. The mRNA was isolated using the
OLIGOTEX kit (Qiagen, Chatsworth CA) and used to construct the cDNA library.
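
The reagent ratios given above translate into volumes as in the following sketch, which is offered only as a worked arithmetic aid. Several readings are assumptions made for the example: the 1:5 ratio is taken as one volume of chloroform per five volumes of lysate, 0.3 M is taken as the final sodium acetate concentration reached by adding a hypothetical 3 M stock, and the 2.5 volumes of ethanol are taken relative to the sample volume after the acetate has been added. The function names are hypothetical.

```python
def trizol_volume_ml(tissue_g: float, ml_per_g: float = 10.0) -> float:
    """TRIZOL volume at the stated ratio of 10 ml per gram of tissue."""
    return tissue_g * ml_per_g


def chloroform_volume_ml(lysate_ml: float, ratio: float = 1.0 / 5.0) -> float:
    """Chloroform volume, reading 1:5 (v/v) as chloroform:lysate (assumption)."""
    return lysate_ml * ratio


def ethanol_precipitation_volumes(sample_ml: float,
                                  acetate_stock_molar: float = 3.0,
                                  acetate_final_molar: float = 0.3,
                                  ethanol_volumes: float = 2.5):
    """Return (sodium acetate ml, ethanol ml) for an ethanol precipitation.

    The 3 M stock is a hypothetical value; the acetate volume is chosen so
    that the sample plus acetate is at the final molarity before ethanol.
    """
    acetate_ml = sample_ml * acetate_final_molar / (
        acetate_stock_molar - acetate_final_molar)
    ethanol_ml = ethanol_volumes * (sample_ml + acetate_ml)
    return acetate_ml, ethanol_ml
```

Under these assumptions, 0.5 gm of tissue calls for 5 ml of TRIZOL and 1 ml of chloroform, and a 0.9 ml RNA sample calls for 0.1 ml of 3 M sodium acetate followed by 2.5 ml of ethanol.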

The mRNA was handled according to the recommended protocols in the
SUPERSCRIPT Plasmid system (Life Technologies). The cDNAs were fractionated on a
SEPHAROSE CL4B column (Amersham Pharmacia Biotech, Piscataway NJ), and those
cDNAs exceeding 400 bp were ligated into pINCY 1 plasmid (Incyte Pharmaceuticals, Palo
Alto CA). The plasmid was subsequently transformed into DH5α competent cells (Life
Technologies).

II Isolation and Sequencing of cDNA Clones

Plasmid DNA was released from the cells and purified using the REAL Prep 96
Plasmid kit (Qiagen). This kit enabled the simultaneous purification of 96 samples in a
96-well block using multi-channel reagent dispensers. The recommended protocol was employed
except for the following changes: 1) the bacteria were cultured in 1 ml of sterile Terrific Broth
(Life Technologies) with carbenicillin at 25 mg/L and glycerol at 0.4%; 2) after inoculation,
the cultures were incubated for 19 hours and at the end of incubation, the cells were lysed with
0.3 ml of lysis buffer; and 3) following isopropanol precipitation, the plasmid DNA pellet was
resuspended in 0.1 ml of distilled water. After the last step in the protocol, samples were
transferred to a 96-well block for storage at 4°C.

The cDNAs were prepared using a MICROLAB 2200 (Hamilton, Reno NV) in
combination with DNA ENGINE thermal cyclers (PTC200; MJ Research, Watertown MA)
and sequenced by the method of Sanger et al. (1975, J Mol Biol 94:441f) using ABI PRISM
377 DNA sequencing systems.

III Selection, Assembly, and Characterization of Sequences

The sequences used for coexpression analysis were assembled from EST sequences, 5'
and 3' longread sequences, and full length coding sequences. Selected assembled sequences
were expressed in at least three cDNA libraries.

The assembly process is described as follows. EST sequence chromatograms were
processed and verified. Quality scores were obtained using PHRED (Ewing et al. (1998)
Genome Res 8:175-185; Ewing and Green (1998) Genome Res 8:186-194). Then the edited
sequences were loaded into a relational database management system (RDBMS). The EST
sequences were clustered into an initial set of bins using BLAST with a product score of 50.
All clusters of two or more sequences were created as bins. The overlapping sequences
represented in a bin correspond to the sequence of a transcribed gene.

Assembly of the component sequences within each bin was performed using a
modification of PHRAP, a publicly available program for assembling DNA fragments (Phil
Green, University of Washington, Seattle WA). Bins that showed 82% identity from a local
pair-wise alignment between any of the consensus sequences were merged.

Bins were annotated by screening the consensus sequence in each bin against public
databases, such as GBpri and GenPept from NCBI. The annotation process involved a FASTn
screen against the GBpri database in GenBank. Those hits with a percent identity of greater
than or equal to 70% and an alignment length of greater than or equal to 100 base pairs were
recorded as homolog hits. The residual unannotated sequences were screened by FASTx
against GenPept. Those hits with an E value of less than or equal to 1 0*^ are recorded as
homolog hits.

Sequences were then reclustered using BLASTn and Cross-Match, a program for rapid
protein and nucleic acid sequence comparison and database search (Green, supra),
sequentially. Any BLAST alignment between a sequence and a consensus sequence with a
score greater than 150 was realigned using cross-match. The sequence was added to the bin
whose consensus sequence gave the highest Smith-Waterman score amongst local alignments
with at least 82% identity. Non-matching sequences created new bins. The assembly and
consensus generation processes were performed for the new bins.
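
The reclustering rule described above (add each sequence to the bin whose consensus gives the highest Smith-Waterman score among alignments with at least 82% identity, otherwise create a new bin) can be sketched as follows. This is a minimal illustration that assumes the alignments have already been computed by an external tool; it does not reimplement BLAST or cross-match, and the data structure and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Alignment:
    """Precomputed local alignment of a sequence against a bin consensus."""
    bin_id: int
    sw_score: float          # Smith-Waterman score
    percent_identity: float  # identity of the local alignment, 0..100


def choose_bin(alignments: Iterable[Alignment],
               min_identity: float = 82.0) -> Optional[int]:
    """Return the bin with the highest score among sufficiently identical hits."""
    eligible = [a for a in alignments if a.percent_identity >= min_identity]
    if not eligible:
        return None
    return max(eligible, key=lambda a: a.sw_score).bin_id


def assign_to_bins(sequence_alignments: Dict[str, List[Alignment]],
                   next_bin_id: int) -> Dict[str, int]:
    """Assign each sequence to an existing bin, or open a new bin for it."""
    assignments: Dict[str, int] = {}
    for seq_id, alignments in sequence_alignments.items():
        chosen = choose_bin(alignments)
        if chosen is None:
            chosen = next_bin_id  # non-matching sequence creates a new bin
            next_bin_id += 1
        assignments[seq_id] = chosen
    return assignments
```

Note that a high-scoring alignment below the identity threshold is ignored, so a sequence may join a bin with a lower score but sufficient identity.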
IV Coexpression Analyses of Known Matrix-remodeling Genes 

Twenty-one known matrix-remodeling genes were selected to identify novel genes that
are closely associated with matrix remodeling. The known genes were osteonectin (BM-40),
chondroitin/dermatan sulfate proteoglycans (C/DSPG), collagen I, II, III, and IV (coll-I, coll-II,
and coll-III), connective tissue growth factor (CTGF), fibrillin, fibronectins, fibronectin
receptor (fibr-r), fibulin 1, heparan sulfate proteoglycans (HSPG), extracellular matrix protein
(hevin), insulin-like growth factor 1 (IGF 1), insulin-like growth factor binding protein
(IGFBP), laminin, lumican, matrix Gla protein (MGP), matrix metalloproteases (MMP), and
tissue inhibitors of matrix metalloproteinase 1, 2, and 3 (TIMP 1, 2, and 3). The protein
products of the known matrix-remodeling genes may be categorized as follows.

1. Extracellular matrix component proteins. These proteins include collagens,
proteoglycans, fibrillin, fibronectin, fibulin, and laminin that constitute the major structures of
the extracellular matrix.

2. Matrix proteases and matrix protease inhibitors. These proteins include matrix
metalloproteases (MMPs) such as the collagenases, and MMP inhibitors such as the tissue
inhibitors of matrix metalloproteases (TIMPs).

3. Regulatory proteins that control expression of matrix-remodeling genes. Such
regulatory proteins include connective tissue growth factor, insulin-like growth factor,
osteonectin (BM-40), and the receptors for and inhibitors of these proteins.

The known matrix-remodeling genes that we examined in this analysis, and brief 
descriptions of their functions, are listed in Table 4. Detailed descriptions of their roles in 
matrix remodeling may be found in the cited articles and reviews. 

Table 4. Known Matrix-remodeling Genes.

Gene          Description & References

BM-40         Alternate names: SPARC, osteonectin
              Regulates connective tissue remodeling, wound healing, angiogenesis
              Induces matrix metalloprotease synthesis (collagenase & gelatinase)
              Regulates cell movement and proliferation
              Expression increased in neoplastic melanoma, fibrosis, angiogenesis
              (Kamihagi et al. (1994) Biochem Biophys Res Commun 200:423-8; Lane et
              al. (1994) J Cell Biol 125:929-43; Inagaki et al. (1996) Life Sci 58:927-34;
              Ledda et al. (1997) J Invest Dermatol 108:210-4; Shankavaram et al. (1997) J
              Cell Physiol 173:327-34.)

C/DSPG        Chondroitin/dermatan sulfate proteoglycans
              Major extracellular matrix proteoglycan
              Regulate cell proliferation, attachment and migration
              (Darnell et al. (1990) Molecular Cell Biology, Scientific American Press, New
              York NY; Toole (1991) In: Cell Biology of Extracellular Matrix, Plenum,
              New York NY, pp. 305-341; Beck et al. (1993) Biochem Biophys Res
              Commun 190:616-23.)

Collagens     Family of fibrous structural proteins (collagen I, II, III, IV, etc.)
              Most abundant structural component of the extracellular matrix
              Secreted as procollagen; converted to collagen by MMPs
              (Alexander and Werb (1991) In: Cell Biology of Extracellular Matrix, pp.
              255-302; Adams (1993) In: Extracellular Matrix, Marcel Dekker, New York
              NY, pp. 91-119; Schuppan et al. (1993) In: Extracellular Matrix, pp. 201-
              254.)

CTGF          Connective tissue growth factor
              Mediates induction of matrix synthesis and fibrosis
              (Grotendorst (1997) Cytokine Growth Factor Rev 8:171-9; Oemar and
              Luscher (1997) Arterioscler Thromb Vasc Biol 17:1483-9; Ito et al. (1998)
              Kidney Int 53:853-61.)

fibrillin     Major component of extracellular microfibrils (matrix elastic network)
              Present in connective tissue throughout the body
              (Kielty and Shuttleworth (1995) Int J Biochem Cell Biol 27:747-60; Haynes
              et al. (1997) Br J Dermatol 137:17-23; Hayward and Brock (1997) Hum
              Mutat 10:415-23.)

fibronectins  Family of extracellular matrix glycoproteins
              Anchor cells to the matrix
              Bind matrix proteins to cell surface receptors

fibr-r        Fibronectin receptor
              Fibronectin receptors regulate cell adhesion & migration
              (Darnell et al. (1990) Molecular Cell Biology, Scientific American Press,
              New York NY; Ruoslahti (1991) Cell Biology of Extracellular Matrix, pp.
              343-363; Yamada (1991) Cell Biology of Extracellular Matrix, pp. 111-146.)

fibulin 1     Fibronectin-binding extracellular matrix protein
              Mediates platelet adhesion via a bridge of fibrinogen
              Cleaved by matrix metalloproteinases
              Inhibits breast and ovarian cancer cell motility
              (Argraves et al. (1990) J Cell Biol 111:3155-64; Sasaki et al. (1996) Eur J
              Biochem 240:427-34; Hayashido et al. (1998) Int J Cancer 75:654-8.)

HSPG          Heparan sulfate proteoglycans
              Extracellular matrix proteoglycan found on cell surface of many cell types
              Regulate cell interactions with the extracellular matrix
              Bind to collagens and fibronectin in the matrix
              Regulate cell proliferation, attachment and migration
              (Darnell et al. (1990); Toole (1991) In: Cell Biology of Extracellular Matrix,
              pp. 305-341; Schuppan et al. (1993) In: Extracellular Matrix, pp. 201-254.)

hevin         Extracellular matrix protein
              Homolog to BM-40
              Regulates cell adhesion and migration
              Downregulated in metastatic prostate cancer, lung cancer
              (Girard and Springer (1996) J Biol Chem 271:4511-7; Bendik et al. Cancer
              Res 58:232-6.)

IGF1          Insulin-like growth factor
              Regulates matrix homeostasis and remodeling
              Regulates aggregation, growth and survival of cancer cells
              (Aston et al. (1995) Am J Respir Crit Care Med 151:1597-603; Bitar and
              Labbad (1996) J Surg Res 61:113-9; Guvakova and Surmacz (1997) Exp Cell
              Res 231:149-62; Sunic et al. (1998) Endocrinology 139:2356-62.)

IGFBP         Insulin-like growth factor binding protein
              Regulates IGF-1 bioavailability (binds IGF-1 more strongly than the receptor)
              Degraded by matrix metalloproteases
              (Kiefer et al. (1991) Biochem Biophys Res Commun 176:219-25; Fowlkes et
              al. (1995) Prog Growth Factor Res 6:255-63; Parker et al. (1996) J Biol Chem
              271:13523-9.)

laminin       Major protein in basal lamina, with collagen, HSPG, and entactin
              Anchors cells to the matrix by binding collagen, HSPG and heparin
              Laminins and collagens are the main targets of MMPs
              Regulates cell attachment, migration, growth, and differentiation
              (Yamada et al. (1993) In: Extracellular Matrix, pp. 49-66; Giannelli et al.
              (1997) Science 277:225-8; Quaranta and Plopper (1997) Kidney Int 51:1441-
              6; Soini et al. (1997) Hum Pathol 28:220-6.)

lumican       Extracellular proteoglycan
              Organizes collagen fibrils in extracellular matrix
              (Dourado et al. (1996) Osteoarthritis Cartilage 4:187-96; Scott (1996)
              Biochemistry 35:8795-9; Cs-Szabo et al. (1997) Arthritis Rheum 40:1037-45.)

MGP           Matrix Gla protein
              Regulates calcification of cartilage
              Marker for osteoblast activity
              (Shanahan et al. (1994) J Clin Invest 93:2393-402; Luo et al. (1997) Nature
              386:78-81; Martinetti et al. (1997) Tumour Biol 18:197-205.)

MMP           Family of Matrix Metalloproteases (including collagenases)
              Cleave procollagen to produce collagen
              (Alexander and Werb (1991) In: Cell Biology of Extracellular Matrix, pp.
              255-302; Adams (1993) In: Extracellular Matrix, pp. 91-119; Schuppan et al.
              (1993) In: Extracellular Matrix, pp. 201-254.)

TIMP 1,2,3    Tissue inhibitors of matrix metalloproteinases
              Bind and inactivate matrix proteases
              (Schuppan et al. (1993) In: Extracellular Matrix, pp. 201-254; Zvibel and
              Kraft (1993) In: Extracellular Matrix, pp. 559-580.)

The coexpression of the 21 known genes with each other is shown in Table 5. The
entries in Table 5 are the negative log of the p-value (-log p) for the coexpression of the two
genes. As shown, the method successfully identified the strong association of the known genes
among themselves, indicating that the coexpression analysis method of the present invention
was effective in identifying genes that are closely associated with matrix remodeling.
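
As an illustration of how a -log p entry of this kind could be computed, the following minimal sketch scores two genes by their presence or absence across a set of libraries. It assumes that co-occurrence is tested with a one-sided Fisher exact test on a 2x2 table; that choice of test, the function name `coexpression_neg_log_p`, and the boolean input format are assumptions made for the example and are not taken from this section.

```python
from math import comb, log10


def coexpression_neg_log_p(presence_a, presence_b):
    """Return -log10(p) for the co-occurrence of two genes.

    presence_a and presence_b are equal-length sequences of booleans,
    one entry per library (True = gene detected in that library).
    The p-value is a one-sided Fisher exact (hypergeometric tail)
    probability; this test is an assumption of the sketch.
    """
    if len(presence_a) != len(presence_b):
        raise ValueError("both genes must be scored in the same libraries")
    n = len(presence_a)
    both = sum(1 for a, b in zip(presence_a, presence_b) if a and b)
    total_a = sum(1 for a in presence_a if a)
    total_b = sum(1 for b in presence_b if b)
    # Probability of observing at least `both` shared libraries if the
    # two genes were distributed independently of each other.
    upper = min(total_a, total_b)
    favourable = sum(
        comb(total_a, k) * comb(n - total_a, total_b - k)
        for k in range(both, upper + 1)
    )
    p = favourable / comb(n, total_b)
    return -log10(p)


# Toy data: 10 libraries, both genes detected in the same 5 libraries.
gene_a = [True] * 5 + [False] * 5
gene_b = [True] * 5 + [False] * 5
score = coexpression_neg_log_p(gene_a, gene_b)
```

Larger values indicate stronger evidence of coexpression; a cutoff of p less than 0.00001 corresponds to a score greater than 5 on this scale.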
V Novel Genes Associated with Matrix Remodeling

Using coexpression analysis, we have identified 20 novel genes that show strong
association with known matrix-remodeling genes from a total of 41,419 assembled gene
sequences. The degree of association was measured by probability values, with a cutoff of
p-value less than 0.00001. This was followed by annotation and literature searches to ensure
that the genes that passed the probability test have strong association with known
matrix-remodeling genes. This process was iterated so that the initial 41,419 genes were reduced to
the final 20 matrix-remodeling genes. Details of the coexpression patterns for the 20 novel
matrix-remodeling genes are presented in Table 6.

Table 5. Coexpression of 21 known matrix-remodeling genes (-log p)

Each of the 20 novel genes is coexpressed with at least two of the 21 known genes
with a p-value of less than 10^-5. The coexpression results are shown in Table 6.
The novel genes identified are listed in the table by their Incyte clone numbers (Clone), and
the known genes by their abbreviated names (Gene), as shown in Example IV.
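
The selection rule described above (a candidate is retained when it is coexpressed with at least two known genes at p less than 0.00001, that is, -log p greater than 5) can be sketched as a simple filter. The clone labels and scores below are invented placeholders for illustration only, not values from Table 6, and the function name `select_candidates` is hypothetical.

```python
def select_candidates(neg_log_p, min_score=5.0, min_partners=2):
    """Keep candidates coexpressed with enough known genes.

    neg_log_p maps a candidate identifier to a dict of
    {known_gene_name: -log10(p)}. A candidate passes when at least
    `min_partners` known genes score above `min_score`.
    """
    selected = {}
    for candidate, scores in neg_log_p.items():
        partners = sorted(g for g, s in scores.items() if s > min_score)
        if len(partners) >= min_partners:
            selected[candidate] = partners
    return selected


# Hypothetical scores (NOT data from Table 6).
toy_scores = {
    "clone_A": {"laminin": 7.2, "lumican": 6.1, "MGP": 2.0},
    "clone_B": {"laminin": 5.5, "lumican": 3.0, "MGP": 1.2},
}
passing = select_candidates(toy_scores)
```

In this toy input only `clone_A` passes, because it exceeds the threshold with two known genes, while `clone_B` does so with only one.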



Table 6. Coexpression of 20 novel genes with known matrix-remodeling genes. (- log p) 
VI Novel Genes Associated with Matrix Remodeling

The 20 novel genes were identified from the data shown in Table 6 to be associated with 
matrix remodeling. 

The nucleotide sequences comprising the consensus sequences of SEQ ID NOs:1-20 of the
present invention were first identified from Incyte Clones 606132, 627722, 639644, 1362659,
1446685, 1556751, 1656953, 1662318, 1996726, 2137155, 2268890, 2305981, 2457612, 2814981,
3089150, 3206667, 3284695, 3481610, 3722004, and 3948614, respectively, and assembled according
to Example III. BLAST and other motif searches were performed for SEQ ID NOs:1-20 according to
Example VII. The sequences of SEQ ID NOs:1-20 were translated and sequence identity was sought
with known sequences. Polypeptide sequences comprising the consensus sequences of SEQ ID
NO:21, SEQ ID NO:22, and SEQ ID NO:23 of the present invention were encoded by SEQ ID NO:2,
SEQ ID NO:6, and SEQ ID NO:11, respectively. SEQ ID NOs:21-23 were analyzed using BLAST
and other motif search tools as disclosed in Example VII.
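
The translation step mentioned above can be illustrated with a short sketch that converts a nucleotide sequence to amino acids in a chosen reading frame using the standard genetic code. The function name `translate`, the use of "X" for codons containing an ambiguous base such as n, and the toy input are assumptions of this example; the software actually used for translation is not named in this section.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna, frame=0):
    """Translate a DNA string in the given forward frame (0, 1, or 2).

    Spaces (as used in sequence listings) are ignored; codons that
    contain an ambiguous base are reported as 'X'; '*' marks a stop.
    """
    seq = dna.upper().replace(" ", "")
    protein = []
    for i in range(frame, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)


# Toy input in listing style (lower case, blocks separated by a space).
peptide = translate("atggcc taa")
```

Translating in all three forward frames, and in the three frames of the reverse complement, is the usual way to locate an open reading frame before searching protein databases.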

SEQ ID NO:3 is 2987 nucleotides in length and shows about 59% sequence identity from about
nucleotide 2117 to about nucleotide 2914 with the cDNA encoding a regulatory subunit of a human
cAMP-dependent protein kinase, RIIbeta (WO 88/03164). SEQ ID NO:8 is 3017 nucleotides in length
and shows about 70% to about 74% sequence identity from about nucleotide 1 to about nucleotide
1260 and about nucleotide 1925 to about nucleotide 1985 with human Hpast mRNA (g2529706), a gene
associated with multiple endocrine neoplasia type 1. SEQ ID NO:9 is 1735 nucleotides in length and
shows about 25% sequence identity from about nucleotide 5 to about nucleotide 1534 with a human
neuronal cell adhesion molecule (WO 96/04396) important in the development of the nervous system by
promoting cell-cell adhesion. SEQ ID NO:14 is 2040 nucleotides in length and shows about 60% to
70% sequence identity from about nucleotide 1 to about nucleotide 1023 with a human mRNA for a
serine protease (g1621243) specific for insulin-like growth factor-binding proteins. The amino acid
sequence encoded by SEQ ID NO:14 from about nucleotide 3 to about nucleotide 1043 shows about
61% sequence identity with an osteoblast-like cell-derived protein (J09107980) useful for treatment
and prevention of various diseases and as a contraceptive. SEQ ID NO:15 is 2121 nucleotides in length
and shows 60-80% sequence identity with a mouse gene, ADAMT-1 (g2809056), a member of the
ADAM (a disintegrin and metalloproteinase) family. ADAMT-1 has been shown to contain the
thrombospondin (TSP) type I motif; expression of ADAMT-1 is closely associated with inflammatory
processes (Kuno et al. (1997) Genomics 46:466-471). SEQ ID NO:16 is 2900 nucleotides in length
and shows about 70% sequence identity with a mouse homeobox (Pmx) mRNA (g460124).
Homeobox genes are expressed in very specific temporal and spatial patterns and function as
transcriptional regulators of developmental processes (Kern et al. (1994) Genomics 19:334-340).
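
For readers unfamiliar with how a figure such as "about 59% sequence identity" over a stated range is obtained, the sketch below computes percent identity from two sequences that have already been aligned. It counts matching positions over aligned, non-gap columns; other conventions (for example, counting gap columns in the denominator) give different values, and the convention used for the figures above is not stated here. The function name `percent_identity` and the toy alignment are assumptions of the example.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned, non-gap columns.

    Both inputs are rows of one pairwise alignment, of equal length,
    with '-' marking gaps. Columns with a gap in either row are
    skipped (one common convention among several).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


# Toy alignment: 4 matches over 5 compared columns.
identity = percent_identity("ACGT-A", "ACCTTA")
```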

SEQ ID NO:21 is 551 amino acid residues long and shows about 37% sequence identity from
about amino acid residue 10 to about amino acid residue 278 with PALM (g3219602), a human
paralemmin that is membrane-bound and expressed abundantly in brain and at intermediate levels in the
kidney and in endocrine cells. In addition, the sequence encompassing residues 418 to 434 of SEQ ID
NO:21 resembles one of the structural fingerprint regions of a seven transmembrane receptor, LCR1,
that is isolated from the human brain (Rimland et al. (1991) Mol Pharmacol 40:869-875). SEQ ID
NO:21 also has one potential amidation site at L546; three potential N-glycosylation sites at N223,
N229, and N408; one potential cAMP- and cGMP-dependent protein kinase phosphorylation site at
S486; fifteen potential casein kinase II phosphorylation sites at S57, S100, T101, T116, S135, S253,
T349, S370, T387, S426, T434, S489, S505, S520, and T526; one potential N-myristoylation site at
G54; and nine potential protein kinase C phosphorylation sites at T15, S25, S57, S100, S123, S247,
S364, S370, and S505. SEQ ID NO:22 is 99 amino acid residues in length. The sequence of SEQ ID
NO:22 from about amino acid residue 71 to about amino acid residue 81 resembles one of the
fingerprint regions of the RH1 and RH2 opsins, a family of G protein coupled receptors that mediate
vision (Zuker et al. (1985) Cell 40:851-858; Cowman et al. (1986) Cell 44:705-710). SEQ ID NO:22
also has one potential N-myristoylation site at G24, and two potential protein kinase C
phosphorylation sites at S13 and S89. SEQ ID NO:23 is 493 amino acid residues in length and shows
about 44% sequence identity from about amino acid residue 277 to about amino acid residue 487 with
an angiopoietin-like factor from the human cornea, CDT6 (g2765527). Angiopoietin 1 and
angiopoietin 2 function as a natural ligand and a natural inhibitor, respectively, for TIE2, a receptor
critical in angiogenesis during embryonic development, tumor growth, and tumor metastasis. The
sequences encompassing amino acid residues 305 to 343, 346 to 355, 365 to 402, 411 to 424, and 428
to 458 of SEQ ID NO:23 resemble the carboxy-terminal domain signatures of fibrinogen beta and
gamma chains from BLOCKS analysis. SEQ ID NO:23 also exhibits one potential signal peptide
region encompassing amino acid residues M1 to G22 when analyzed using an HMM-based signal
peptide analysis tool. In addition, SEQ ID NO:23 shows two potential N-glycosylation sites at N164
and N192; one potential cAMP- and cGMP-dependent protein kinase phosphorylation site at S127;
six potential casein kinase II phosphorylation sites at S34, S209, T238, S266, T368, and T417; four
potential N-myristoylation sites at G12, G18, G22, and G29; eight potential protein kinase C
phosphorylation sites at S34, S209, T268, T299, T335, S373, S383, and S477; and three potential
tyrosine kinase phosphorylation sites at Y183, Y392, and Y467.

VII Homology Searching for Matrix-Remodeling Genes and the Proteins Encoded by the
Genes

Polynucleotide sequences, SEQ ID NOs:1-20, and polypeptide sequences, SEQ ID NOs:21-23,
were queried against databases derived from sources such as GenBank and SwissProt. These
databases, which contain previously identified and annotated sequences, were searched for regions of
similarity using Basic Local Alignment Search Tool (BLAST; Altschul (1990) supra) and
Smith-Waterman alignment (Smith et al. (1992) Protein Engineering 5:35-51). BLAST searched for matches
and reported only those that satisfied the probability thresholds of 1 0*^^ or less for nucleotide 
sequences and 1 0'* or less for polypeptide sequences. 

The polypeptide sequences were also analyzed for known motif patterns using MOTIFS,
SPSCAN, BLIMPS, and Hidden Markov Model (HMM)-based protocols. MOTIFS (Genetics
Computer Group, Madison WI) searches polypeptide sequences for patterns that match those defined
in the Prosite Dictionary of Protein Sites and Patterns (Bairoch et al. supra), and displays the patterns
found and their corresponding literature abstracts. SPSCAN (Genetics Computer Group) searches for
potential signal peptide sequences using a weighted matrix method (Nielsen et al. (1997) Prot Eng
10:1-6). Hits with a score of 5 or greater were considered. BLIMPS uses a weighted matrix analysis
algorithm to search for sequence similarity between the polypeptide sequences and those contained in
BLOCKS, a database consisting of short amino acid segments, or blocks, of 3-60 amino acids in
length, compiled from the PROSITE database (Henikoff et al. supra; Bairoch et al. supra), and those in
PRINTS, a protein fingerprint database based on non-redundant sequences obtained from sources such
as SwissProt, GenBank, PIR, and NRL-3D (Attwood et al. (1997) J Chem Inf Comput Sci
37:417-424). For the purposes of the present invention, the BLIMPS searches reported matches with a
cutoff score of 1 000 or greater and a cutoff probability value of 1 .0 x 1 0'\ HMM-based protocols 
were based on a probabilistic approach and searched for consensus primary structures of gene families
in the protein sequences (Eddy, supra; Sonnhammer, supra). More than 500 known protein families
with cutoff scores ranging from 10 to 50 bits were selected for use in this invention. 
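
Pattern-based motif searching of the kind performed by MOTIFS can be illustrated with regular expressions. The sketch below encodes three well-known PROSITE patterns: N-glycosylation (N-{P}-[ST]-{P}), protein kinase C phosphorylation ([ST]-x-[RK]), and casein kinase II phosphorylation ([ST]-x(2)-[DE]). It is a stand-in written for this illustration, not the MOTIFS program itself; the function name `find_sites` and the toy peptide are hypothetical, and such short patterns yield only potential sites.

```python
import re

# PROSITE-style patterns rewritten as regular expressions.
PATTERNS = {
    "N-glycosylation": r"N[^P][ST][^P]",      # N-{P}-[ST]-{P}
    "PKC phosphorylation": r"[ST].[RK]",      # [ST]-x-[RK]
    "CK2 phosphorylation": r"[ST].{2}[DE]",   # [ST]-x(2)-[DE]
}


def find_sites(protein, patterns=PATTERNS):
    """Return (motif name, site label) pairs, e.g. ('N-glycosylation', 'N3').

    The site label is the first residue of the match and its 1-based
    position. A lookahead is used so overlapping matches are reported.
    """
    seq = protein.upper()
    hits = []
    for name, pattern in patterns.items():
        for match in re.finditer(f"(?=({pattern}))", seq):
            position = match.start() + 1  # 1-based residue numbering
            hits.append((name, f"{seq[match.start()]}{position}"))
    return hits


# Toy peptide (not one of SEQ ID NOs:21-23).
sites = find_sites("AANGSAKSAAEA")
```

Site labels follow the residue-plus-position style used in this document (for example, N223 or S486), which makes the output directly comparable with annotation lists of that form.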

VIII Labeling and Use of Individual Hybridization Probes 

Oligonucleotides are designed using state-of-the-art software such as OLIGO 4.06 software
(National Biosciences) and labeled by combining 50 pmol of each oligomer, 250 µCi of [γ-32P]
adenosine triphosphate (Amersham Pharmacia Biotech), and T4 polynucleotide kinase (NEN Life
Science Products, Boston MA). The labeled oligonucleotides are substantially purified using a
SEPHADEX G-25 superfine resin column (Amersham Pharmacia Biotech). An aliquot containing 10"' 
counts per minute of the labeled probe is used in a typical membrane-based hybridization analysis of
human genomic DNA digested with one of the following endonucleases: Ase I, Bgl II, Eco RI, Pst I,
Xba I, or Pvu II (NEN Life Science Products).

The DNA from each digest is fractionated on a 0.7 percent agarose gel and transferred to
nylon membranes (NYTRAN PLUS, Schleicher & Schuell, Durham NH). Hybridization is carried out
under the following conditions: 5x SSC/0.1% SDS at 60° C for about 6 hours; subsequent washes are
performed at higher stringency with buffers such as 1x SSC/0.1% SDS at 45° C, then 0.1x SSC. After
XOMAT AR film (Eastman Kodak, Rochester NY) is exposed to the blots for several hours,
hybridization patterns are compared.

IX Production of Specific Antibodies 

SEQ ID NO:20, 21, or 23 substantially purified using polyacrylamide gel electrophoresis 
(Harrington (1990) Methods Enzymol 182:488-495), or other purification techniques, is used to
immunize rabbits and to produce antibodies using standard protocols. 

Alternatively, the amino acid sequence is analyzed using LASERGENE software (DNASTAR, 
Madison WI) to determine regions of high immunogenicity, and a corresponding oligopeptide is 
synthesized and used to raise antibodies by means known to those of skill in the art. Methods for 
selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well
described in the art. Typically, oligopeptides 15 residues in length are synthesized using an ABI 431A
peptide synthesizer (PE Biosystems) using Fmoc-chemistry and coupled to KLH (Sigma-Aldrich, St.
Louis MO) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester to increase
immunogenicity. Rabbits are immunized with the oligopeptide-KLH complex in complete Freund's
adjuvant. Resulting antisera are tested for antipeptide activity by, for example, binding the peptide to
plastic, blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with
radioiodinated goat anti-rabbit IgG.
What is claimed is:

1. A substantially purified polynucleotide comprising a gene that is coexpressed with one or
more known matrix-remodeling genes in a plurality of biological samples, wherein each known
matrix-remodeling gene is selected from the group consisting of osteonectin, chondroitin/dermatan
sulfate proteoglycans, collagen I, II, III, and IV, connective tissue growth factor, fibrillin, fibronectins,
fibronectin receptor, fibulin 1, heparan sulfate proteoglycan, extracellular matrix protein, insulin-like
growth factor 1, insulin-like growth factor binding protein, laminin, lumican, matrix Gla protein,
matrix metalloproteases, and tissue inhibitors of matrix metalloproteinase 1, 2, and 3.

2. The polynucleotide of claim 1, comprising a polynucleotide sequence selected from the
group consisting of:

(a) a polynucleotide sequence selected from the group consisting of SEQ ID NOs:1-20;

(b) a polynucleotide sequence which encodes the polypeptide sequence of SEQ ID NO:21,
22, or 23;

(c) a polynucleotide sequence having at least 70% identity to the polynucleotide sequence of
(a) or (b);

(d) a polynucleotide sequence comprising at least 18 sequential nucleotides of the
polynucleotide sequence of (a), (b), or (c);

(e) a polynucleotide which hybridizes under stringent conditions to the polynucleotide of (a),
(b), (c), or (d); and

(f) a polynucleotide sequence which is complementary to the polynucleotide sequence of (a),
(b), (c), (d), or (e).

3. A substantially purified polypeptide comprising the gene product of a gene that is
coexpressed with one or more known matrix-remodeling genes in a plurality of biological samples,
wherein each known matrix-remodeling gene is selected from the group consisting of osteonectin,
chondroitin/dermatan sulfate proteoglycans, collagen I, II, III, and IV, connective tissue growth factor,
fibrillin, fibronectins, fibronectin receptor, fibulin 1, heparan sulfate proteoglycans, extracellular
matrix protein, insulin-like growth factor 1, insulin-like growth factor binding protein, laminin,
lumican, matrix Gla protein, matrix metalloproteases, and tissue inhibitors of matrix metalloproteinase
1, 2, and 3.

4. The polypeptide of claim 3, comprising a polypeptide sequence selected from the group
consisting of:

(a) the polypeptide sequence of SEQ ID NO:21, 22, or 23;

(b) a polypeptide sequence having at least 85% identity to the polypeptide sequence of (a);
and

(c) a polypeptide sequence comprising at least 6 sequential amino acids of the polypeptide
sequence of (a) or (b).

5. An expression vector comprising the polynucleotide of claim 2. 

6. A host cell comprising the expression vector of claim 5. 

7. A pharmaceutical composition comprising the polynucleotide of claim 2 or the polypeptide
of claim 3 in conjunction with a suitable pharmaceutical carrier.

8. An antibody which specifically binds to the polypeptide of claim 4.

9. A method for diagnosing a disease or condition associated with the altered expression of a
gene that is coexpressed with one or more known matrix-remodeling genes, wherein each known
matrix-remodeling gene is selected from the group consisting of osteonectin, chondroitin/dermatan
sulfate proteoglycans, collagen I, II, III, and IV, connective tissue growth factor, fibrillin, fibronectins,
fibronectin receptor, fibulin 1, heparan sulfate proteoglycans, extracellular matrix protein, insulin-like
growth factor 1, insulin-like growth factor binding protein, laminin, lumican, matrix Gla protein,
matrix metalloproteases, and tissue inhibitors of matrix metalloproteinase 1, 2, and 3, the method
comprising the steps of:

(a) providing a sample comprising one or more of said coexpressed genes;

(b) hybridizing the polynucleotide of claim 2(f) to said coexpressed genes under conditions
effective to form one or more hybridization complexes; and

(c) detecting the hybridization complexes, wherein the altered level of hybridization
complexes compared with the level of hybridization complexes of a nondiseased sample
correlates with the presence of the disease or condition.

10. A method for treating or preventing a disease associated with the altered expression of a
gene that is coexpressed with one or more known matrix-remodeling genes in a subject in need,
wherein each known matrix-remodeling gene is selected from the group consisting of osteonectin,
chondroitin/dermatan sulfate proteoglycans, collagen I, II, III, and IV, connective tissue growth factor,
fibrillin, fibronectins, fibronectin receptor, fibulin 1, heparan sulfate proteoglycans, extracellular
matrix protein, insulin-like growth factor 1, insulin-like growth factor binding protein, laminin,
lumican, matrix Gla protein, matrix metalloproteases, and tissue inhibitors of matrix metalloproteinase
1, 2, and 3, the method comprising the step of administering to said subject in need the pharmaceutical
composition of claim 7 in an amount effective for treating or preventing said disease.

11. A method for treating or preventing a disease associated with the altered expression of a
gene that is coexpressed with one or more known matrix-remodeling genes in a subject in need,
wherein each known matrix-remodeling gene is selected from the group consisting of osteonectin,
chondroitin/dermatan sulfate proteoglycans, collagen I, II, III, and IV, connective tissue growth factor,
fibrillin, fibronectins, fibronectin receptor, fibulin 1, heparan sulfate proteoglycans, extracellular
matrix protein, insulin-like growth factor 1, insulin-like growth factor binding protein, laminin,
lumican, matrix Gla protein, matrix metalloproteases, and tissue inhibitors of matrix metalloproteinase
1, 2, and 3, the method comprising the step of administering to said subject in need the antibody of
claim 8 in an amount effective for treating or preventing said disease.

12. A method for treating or preventing a disease associated with the altered expression of a
gene that is coexpressed with one or more known matrix-remodeling genes in a subject in need,
wherein each known matrix-remodeling gene is selected from the group consisting of osteonectin,
chondroitin/dermatan sulfate proteoglycans, collagen I, II, III, and IV, connective tissue growth factor,
fibrillin, fibronectins, fibronectin receptor, fibulin 1, heparan sulfate proteoglycans, extracellular
matrix protein, insulin-like growth factor 1, insulin-like growth factor binding protein, laminin,
lumican, matrix Gla protein, matrix metalloproteases, and tissue inhibitors of matrix metalloproteinase
1, 2, and 3, the method comprising the step of administering to said subject in need the polynucleotide
sequence of claim 2(f) in an amount effective for treating or preventing said disease.
SEQUENCE LISTING



<110> INCYTE PHARMACEUTICALS, INC.
WALKER, Michael G. 
VOLKMUTH, Wayne 
KLINGLER, Tod M. 

<120> MATRIX-REMODELING GENES 

<130> PB-0004 PCT 

<140> To Be Assigned 
<141> Herewith 

<150> 09/169,289 
<151> 1998-10-09 

<160> 23 

<170> PERL Program 



<210> 1 

<211> 1447 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> unsure 
<222> 1380 

<223> a or g or c or t, unknown, or other 
<220> 

<221> misc_feature 

<223> Incyte ID No.: 606132CB1 



<400> 1 

cctggaacca gaaggagacc tacctgcaca tcatgaagaa cgaggaggag gtggtgatct 60
tgttcgcgca ggtgggcgac cgcagcatca tgcaaagcca gagcctgatg ctggagctgc 120
gagagcagga ccaggtgtgg gtacgcctct acaagggcga acgtgagaac gccatcttca 180
gcgaggagct ggacacctac atcaccttca gtggctacct ggtcaagcac gccaccgagc 240
cctagctggc cggccacctc ctttcctctc gccaccttcc acccctgcgc tgtgctgacc 300
ccaccgcctc ttccccgatc cctggactcc gactccctgg ctttggcatt cagtgagacg 360
ccctgcacac acagaaagcc aaagcgatcg gtgctcccag atcccgcagc ctctggagag 420
agctgacggc agatgaaatc accagggcgg ggcacccgcg agaaccctct gggaccttcc 480
gcggccctct ctgcacacat cctcaagtga ccccgcacgg cgagacgcgg gtggcggcag 540
ggcgtcccag ggtgcggcac cgcggctcca gtccttggaa ataattaggc aaattctaaa 600
ggtctcaaaa ggagcaaagt aaaccgtgga ggacaaagaa aagggttgtt atttttgtct 660
ttccagccag cctgctggct cccaagagag aggccttttc agttgagact ctgcttaaga 720
gaagatccaa agttaaagct ctggggtcag gggaggggcc gggggcagga aactacctct 780
ggcttaattc ttttaagcca cgtaggaact ttcttgaggg ataggtggac cctgacatcc 840
ctgtggcctt gcccaagggc tctgctggtc tttctgagtc acagctgcga ggtgatgggg 900
gctggggccc caggcgtcag ctcccagagg gacagctgag ccccctgcct tggctccagg 960
ttggtagaag cagccgaagg gctcctgaca gtggccaggg acccctgggt cccccaggcc 1020
tgcagatgtt tctatgaggg gcagagctcc tggtacatcc atgtgtggct ctgctccacc 1080
cctgtgccac cccagagccc tggggggtgg tctccatgcc tgccaccctg gcatcggctt 1140
tctgtgccgc ctcccacaca aatcagcccc agaaggcccc ggggccttgg cttctgtttt 1200
ttataaaaca cctcaagcag cactgcagtc tcccatctcc tcgtgggcta agcatcaccg 1260
cttccacgtg tgttgtgttg gttggcagca aggctgatcc agaccccttc tgcccccact 1320
gcgctcatcc aggcctctga ccagtagcct gagaggggct ttttctaggc ttcagagcan 1380
gggagagctg gacggggtag acagtccgct tgtctgttct aagctctgtg agctcagtct 1440
gagacaa 1447
<210> 2
<211> 2481 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No.: 627722CB1 



<400> 2 

ctagcaagca ggtaaacgag ctttgtacaa acacacacag accaacacat ccggggatgg 60
ctgtgtgttg ctagagcaga ggctgattaa acactcagtg tgttggctct ctgtgccact 120
cctggaaaat aatgaattgg gtaaggaaca gttaataaga aaatgtgcct tgctaactgt 180
gcacattaca acaaagagct ggcagctcct gaaggaaaag ggcttgtgcc gctgccgttc 240
aaacttgtca gtcaactcat gccagcagcc tcagcgtctg cctccccagc acaccctcat 300
tacatgtgtc tgtctggcct gatctgtgca tctgctcgga gacgctcctg acaagtcggg 360
aatttctcta tttctccact ggtgcaaaga gcggatttct ccctgcttct cttctgtcac 420
ccccgctcct ctcccccagg aggctccttg atttatggta gctttggact tgcttccccg 480
tctgactgtc cttgacttct agaatggaag aagctgagct ggtgaaggga agactccagg 540
ccatcacaga taaaagaaaa atacaggaag aaatctcaca gaagcgtctg aaaatagagg 600
aagacaaact aaagcaccag catttgaaga aaaaggcctt gagggagaaa tggcttctag 660
atggaatcag cagcggaaaa gaacaggaag agatgaagaa gcaaaatcaa caagaccagc 720
accagatcca ggttctagaa caaagtatcc tcaggcttga gaaagagatc caagatcttg 780
aaaaagctga actgcaaatc tcaacgaagg aagaggccat tttaaagaaa ctaaagtcaa 840
ttgagcggac aacagaagac attataagat ctgtgaaagt ggaaagagaa gaaagagcag 900
aagagtcaat tgaggacatc tatgctaata tccctgacct tccaaagtcc tacatacctt 960
ctaggttaag gaaggagata aatgaagaaa aagaagatga tgaacaaaat aggaaagctt 1020
tatatgccat ggaaattaaa gttgaaaaag acttgaagac tggagaaagt acagttctgt 1080
cttcaatacc tctgccatca gatgacttta aaggtacagg aataaaagtt tatgatgatg 1140
ggcaaaagtc agtgtatgca gtaagttcta atcacagtgc agcatacaat ggcaccgatg 1200
gcctggcacc agttgaagta gaggaacttc taagacaagc ctcagagaga aactctaaat 1260
ccccaacaga gtatcatgag cctgtatatg ccaatccctt ttacaggcct acaaccccac 1320
agagagaaac ggtgacccct ggaccaaact ttcaagaaag gataaagatt aaaactaatg 1380
gactgggtat tggtgtaaat gaatccatac acaatatggg caatggtctt tcagaggaaa 1440
ggggaaacaa cttcaatcac atcagtccca ttccgccagt gcctcatccc cgatcagtga 1500
ttcaacaagc agaagagaag cttcacaccc cgcaaaaaag gctaatgact ccttgggaag 1560
aatcgaatgt catgcaggac aaagatgcac cctctccaaa gccaaggctg agccccagag 1620
agacaatatt tgggaaatct gaacaccaga attcttcacc cacttgtcag gaggacgagg 1680
aagatgtcag atataatatc gttcattccc tgcctccaga cataaatgat acagaaccgg 1740
tgacaatgat tttcatgggg tatcagcagg cagaagacag tgaagaagat aagaagtttc 1800
tgacaggata tgatgggatc atccatgctg agctggttgt gattgatgat gaggaggagg 1860
aggatgaagg agaagcagag aaaccgtcct accaccccat agctccccat agtcaggtgt 1920
accagccagc caaaccaaca ccacttccta gaaaaagatc agaagctagt cctcatgaaa 1980
acacaaatca taaatccccc cacaaaaatt ccatatctct gaaagagcaa gaagaaagct 2040
taggcagccc tgtccaccat tccccatttg atgctcagac aactggagat gggactgagg 2100
atccatcctt aacagcttta aggatgagaa tggcaaagct gggaaaaaag gtgatctaag 2160
agttgtacca cctatataaa catcctttga agaagaaact aagaagcatt tgcaaatttc 2220
tcttctggat attttgttta ttttttctga agtccaaaaa attatcatta cagtgtacca 2280
tattaagcca tgtgaataag tagtagtcat tatttgtgaa aaattcccaa aaagctgggg 2340
aaaacaaatg tgtaactttt ccagttactt gacacgattc agtgggggaa aaccagcatt 2400
ttttattcta ttgataccaa agcatttcta ataagagctt gttaaattta agaataaagt 2460
tatttaaaat aaaaaaaaaa a 2481



<210> 3 

<211> 2987 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> unsure 
<222> 2955

<223> a or g or c or t, unknown, or other 



<220> 

<221> misc_feature 

<223> Incyte ID No.: 639644CB1 

<400> 3 

agaaaaaaag aaaaaagaaa aaaactaagg cagcagctct taataaataa cacctggagc 60
agaatcggta aactgctttc acgttggctt ttgcagaagt ggcaatgcat tgaggataca 120
tctggcaagc ttcgaattca caagtgtaaa ggacccagtg acctgctcac agtccggcag 180
agcacgcgga acctctacgc tcgcggcttc catgacaaag acaaagagtg cagttgtagg 240
gagtctggtt accgtgccag cagaagccaa agaaagagtc aacggcaatt cttgagaaac 300
caggggactc caaagtacaa gcccagattt gtccatactc ggcagacacg ttccttgtcc 360
gtcgaatttg aaggtgaaat atatgacata aatctggaag aagaagaaga attgcaagtg 420
ttgcaaccaa gaaacattgc taagcgtcat gatgaaggcc acaaggggcc aagagatctc 480
caggcttcca gtggtggcaa caggggcagg atgctggcag atagcagcaa cgccgtgggc 540
ccacctacca ctgtccgagt gacacacaag tgttttattc ttcccaatga ctctatccat 600
tgtgagagag aactgtacca atcggccaga gcgtggaagg accataaggc atacattgac 660
aaagagattg aagctctgca agataaaatt aagaatttaa gagaagtgag aggacatctg 720
aagagaagga agcctgagga atgtagctgc agtaaacaaa gctattacaa taaagagaaa 780
ggtgtaaaaa agcaagagaa attaaagagc catcttcacc cattcaagga ggctgctcag 840
gaagtagata gcaaactgca acttttcaag gagaacaacc gtaggaggaa gaaggagagg 900
aaggagaaga gacggcagag gaagggggaa gagtgcagcc tgcctggcct cacttgcttc 960
acgcatgaca acaaccactg gcagacagcc ccgttctgga acctgggatc tttctgtgct 1020
tgcacgagtt ctaacaataa cacctactgg tgtttgcgta cagttaatga gacgcataat 1080
tttcttttct gtgagtttgc tactggcttt ttggagtatt ttgatatgaa tacagatcct 1140
tatcagctca caaatacagt gcacacggta gaacgaggca ttttgaatca gctacacgta 1200
caactaatgg agctcagaag ctgtcaagga tataagcagt gcaacccaag acctaagaat 1260
cttgatgttg gaaataaaga tggaggaagc tatgacctac acagaggaca gttatgggat 1320
ggatgggaag gttaatcagc cccgtctcac tgcagacatc aactggcaag gcctagagga 1380
gctacacagt gtgaatgaaa acatctatga gtacagacaa aactacagac ttagtctggt 1440
ggactggact aattacttga aggatttaga tagagtattt gcactgctga agagtcacta 1500
tgagcaaaat aaaacaaata agactcaaac tgctcaaagt gacgggttct tggttgtctc 1560
tgctgagcac gctgtgtcaa tggagatggc ctctgctgac tcagatgaag acccaaggca 1620
taaggttggg aaaacacctc atttgacctt gccagctgac cttcaaaccc tgcatttgaa 1680
ccgaccaaca ttaagtccag agagtaaact tgaatggaat aacgacattc cagaagttaa 1740
tcatttgaat tctgaacact ggagaaaaac cgaaaaatgg acggggcatg aagagactaa 1800
tcatctggaa accgatttca gtggcgatgg catgacagag ctagagctcg ggcccagccc 1860
caggctgcag cccattcgca ggcacccgaa agaacttccc cagtatggtg gtcctggaaa 1920
ggacattttt gaagatcaac tatatcttcc tgtgcattcc gatggaattt cagttcatca 1980
gatgttcacc atggccaccg cagaacaccg aagtaattcc agcatagcgg ggaagatgtt 2040
gaccaaggtg gagaagaatc acgaaaagga gaagtcacag cacctagaag gcagcgcctc 2100
ctcttcactc tcctctgatt agatgaaact gttaccttac cctaaacaca gtatttcttt 2160
ttaacttttt tatttgtaaa ctaataaagg taatcacagc caccaacatt ccaagctacc 2220
ctgggtacct ttgtgcagta gaagctagtg agcatgtgag caagcggtgt gcacacggag 2280
actcatcgtt ataatttact atctgccaag agtagaaaga aaggctgggg atatttgggt 2340
tggcttggtt ttgatttttt gcttgtttgt ttgttttgta ctaaaacagt attatctttt 2400
gaatatcgta gggacataag tatatacatg ttatccaatc aagatggcta gaatggtgcc 2460
tttctgagtg tctaaaactt gacacccctg gtaaatcttt caacacactt ccactgcctg 2520
cgtaatgaag ttttgattca tttttaacca ctggaatttt tcaatgccgt cattttcagt 2580
tagatgattt tgcactttga gattaaaatg ccatgtctat ttgattagtc ttattttttt 2640
atttttacag gcttatcagt ctcactgttg gctgtcattg tgacaaagtc aaataaaccc 2700
ccaaggacga cacacagtat ggatcacata ttgtttgaca ttaagctttt gccagaaaat 2760
gttgcatgtg ttttacctcg acttgctaaa atcgattagc agaaaggcat ggctaataat 2820
gttggtggtg aaaataaata aataagtaaa caaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2880
aaaaaaaaaa aaaaaaaaaa aaaaagcaaa aaaagctgcc gccacagtta gatgaagaag 2940
catgaggatc cgagngggtc gcctctttga gtggtgaggg agtcgcg 2987



<210> 4 
<211> 2915 
<212> DNA

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No.: 1362659CB1 

<400> 4 

gaggcaagaa ttcggcacga gggacatttt 
acccggcaca ctcccccttc ctccagcccc 
gtaaagccct gtgccttttt ttcccctgaa 
ctctcaggag ttggggactt tgctaggaga 
ggagccacgt ttgcaggagc tccatttgta 
ccagttcatg tccctgactc tcacctccca 
gagtgatgag agtcaagaag aggggatgta 
atgacctgat ccacctagcc ttttcttctg 
gctgtccaca gtaggaaaca taaagaaaca 
atccatcatc gtaggaaata ggaaagcaaa 
ttcaataatt ctttttttgt gtcttaaata 
gtgttaggtt tcacatatat attcatcaac 
tgtattacct cagatcattt taaatagcaa 
cattcctgtt cacaaaaggt tctcatggtg 
tactttttaa aagtcaatgg ttttttttct 
atatagaaat atatgcaaaa attatagttt 
cagccatatg tattttgttt aaaggattta 
agggagcaca taaccagctg tttggcatga 
taaaaccaat acaccatact ttctttctgc 
aattgttggg ttctagactt ttttaatata 
aagtgtctat gtgcatatgt tttttatata 
ctggcagtgg gtaaatatgg cataagttaa 
tttgaaaagg gtctgatggg gagaaggaga 
acctagaaaa acgggtagta aactgtggat 
ctgtcaggaa atgaatcttc cccccaaccc 
cctgactagt cattaggatc aggggcctct 
gcagagtggt ataaaagaca cgaatatctc 
ttgcattttt catggttttt atttcctgtt 
gtgcaaggat cttatttgtg atgccttccc 
aatggggaca gaattctaaa tggataaaac 
ttaaggctag atccttccca tagtatcatc 
aggggttaag agagagatca cctagaaatc 
tgagtttctt cttccccttg agcttcagag 
ttacctcact gctgaaaacc cagaggggcg 
aaatgcatcc cttcctttct ttcctgcttc 
accatcacag tatgcagaga cttcctcacc 
tggtgagggt gggcacttat aaatgcctgc 
gaatagacca gacgcccttt cacttagttc 
ctgttaggcc tgctgttccc tttgctcttg 
ataactgaat tggcctttgg ttcatgtttt 
tatgccatat atatgtgcca acaaatctat 
ataggaattt tgagtttctt cttcttttag 
tacaaaaaag aaaaacaaaa gattgactat 
gtgatatcaa agcaacgtat accccagtcc 
ttaacagtgc acccaatcta tatttgcatt 
tttgcatgta tttatatggt tcttagggaa 
tcaaatgtgt tgttccactg agaccagaag 
ttggagccaa taaagctttt tgctgatgaa 
aaaaactcat gcccacttgt aaaaaaaaaa 



gccaacttaa acgagaaaaa gaccccccgc 60 
gcttcagcca catgctccag ctgctgccca 120 
tactgcccaa agcatcccct tcccatctgc 180 
ttttttaagt gttccttact gggacaacgt 240 
tccctgctgg tgttgacttc tgtgtagggg 300 
ttagataaat gaagcccacc cccctttcta 360 
tgaacggcca aattcccatg tgagaggaag 420 
gatctgtcct ccctcacccc tttcacctga 480 
atgtccccta catatcccca tgactacata 540 
tttgattttg gttttgtaaa acgtacatgc 600 
ctcatagggg aaaaaaacag ctcacccaag 660 
tattttagaa gatttaattc tatcaaatct 720 
gccaataacg agctttgaag gctattttac 780 
cctgacaggt tacccttgag ggcttgtgtc 840 
tgtgttctag tttccataat aggagagaaa 900 
tctttagatc agaaactgat atttttgggt 960 
aaataaagtg ccgtcatgta gccctgtgga 1020 
caggtgactt agtatatttg taattggttt 1080 
aaacagccat ctttatactt agggaagaaa 1140 
aattttgttg atatggaatt aggtaagttt 1200 
agttttttct attcagtttc actgatccaa 1260 
taacactttt ccccaaaatg gtgctttgga 1320 
acgtatcatc ctagcttcct ctcttaataa 1380 
agtcaggaaa acacccagca agggacacag 1440 
ccaccatgca gatggataga cagaatcttt 1500 
gttggatttg tgtttcttga agaatagctg 1560 
ctggtctata aggatactct gatttggggt 1620 
ccccctggag ttttccatta gtgagttttt 1680 
tcccctagaa agattttgtg caatatatta 1740
aatggctggt tctagccctg agtgacagtc 1800 
tgtcctctgg aatgactctc ctgtccctaa 1860 
cctctggaca cttgtgggtt ctttagggtt 1920 
aggagagttg gcatggttaa atctgaatgg 1980 
tggcacactc gcttgtgtgg aaaagcctct 2040 
ctttgcctta caattgaagc agcccgtggt 2100 
tttcatatct agggaccacc cccgatgcat 2160 
tattgttaag ccattccagc ctcttcctct 2220 
agtgccagtc cttttgcctt cccaaccctg 2280 
attaggagag atggaaggag atgagctccc 2340 
ctccccatat gtatatatgc catatgtgaa 2400 
ctacgttgtt cttttcaaat tagcacgcag 2460 
taactagtat aacaagcact ggtatttttg 2520 
tgtggtctgc atgacataaa caaacaaatg 2580 
agtgtgtgtt gccataattt gcaattcagc 2640
ttgatattat ttaagctcta tgtacaaggt 2700 
aaaaaatgct ataaactgca aatctgaaat 2760 
aagaagagga gttttaaaag ggataatttg 2820 
cagaaaccaa tactgctgtg cactgagaat 2880 
aaagg 2915 



<210> 5 




<211> 1826 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No.: 1446685CB1






<400> 5 

gaaagccgca 

cccgaccggc 

ctgtccccaa 

tggggaaact 

aggaggaggg 

ccgaggacac 

gtgaccccaa 

ttggagaaag 

agaagctttt 

agattgctca 

ttcctcccag 

tcttacacct 

atgtttccat 

tccaagagga 

ttgacacctt 

ctttcgtgaa 

ttgcagatgg 

tgcacagctt 

cctttgagct 

tcaactgtga 

acgtggagtg 

ctggaccctc 

cacaagtcca 

aactcagtgg 

ggaaggttgt 

ttaggcaaaa 

taactgtgtg 

ttcaggactc 

ttggaggtta 

ccttctccct 

tagctaagta 



gcctcagtcc 
ccgcggcagc 
gtctcccact 
cggagggacc 
aatgaacgcc 
gatgctggag 
gcttcaagaa 
aatcattgtg 
cgagaaactg 
gaagcaaaaa 
gagcatcaag 
gctcgttgct 
ccaagtggtt 
aataactggt 
gttcgaccat 
caagcacctg 
ggtgtacctg 
cttcctgacc 
catgcaagat 
cctgaaatct 
aggggctgcc 
ctccgaactg 
gctgcaaccc 
gctgacccat 
tcccttcccg 
gagtccccac 
tcaggcccca 
ccattgacgt 
aatgacttgc 
aaaggtaacc 
tgcattcctc 



cgccgccgcc 
ctgcgccgcg 
cccaagtcgc 
ctggcccgga 
atcaacctgc 
gagaatgagg 
ctgatgaagg 
aaagacctag 
gagagtgaga 
ctgcagactg 
tggaatgtgg 
ctgtctcagt 
gtggtccaga 
aacacagagg 
gccccagaca 
aataaactga 
gtgctgctca 
ccggacagct 
ggagggttgg 
acactacgag 
ctgggcccac 
ccttaccctg 
agagatagtg 
ccctcccagg 
gtgccaggtc 
aagatgaaaa 
cactaagtgc 
aggtgtttca 
cagaagttgg 
actattctga 
aatagt 



cgctgcgtcc 
ccatggccac 
ccccgtcccg 
ggaagaaagc 
ccctcagccc 
tgcgaacaat 
tattaattga 
ctgaagattt 
agctaaatgt 
tcctggagaa 
attctgttca 
atttccgcgc 
aacgagaagg 
ctctttccgg 
agctgaatgt 
acctggaggt 
tggggctcct 
ttgaacagaa 
aaaagccaaa 
tgttgtacaa 
cactgcccaa 
cttattcctg 
gaaactgaaa 
cgctggggac 
cagatttccc 
taaagatcct 
tctgctctga 
ttcccctttt 
aatttttttc 
gtccaatcat 



gcccagcgcc 
ctccccgcag 
caagaaagat 
caaggaggtg 
aattcccttt 
ggtggatcca 
ctggattaat 
gtatgatgga 
ggctgaggtc 
gatcaatgaa 
tgccaagagc 
accaattcga 
aatcctccag 
gaggcatgaa 
ggtgaaaaag 
cacagaactg 
ggagggctac 
ggtcttgaat 
accgcggcca 
cctcttcacc 
gagttcttgc 
tctcttgcac 
ttaggaagga 
caacctagca 
tccatgattt 
agttaccatt 
tatactcaag 
acagatgagg 
ctctttgaac 
caaggttttg 



agctccgcgt 
aagtcgcctt 
gattccttct 
tccgagctgc 
gagctggacc 
aactcacgca 
gatgtgttgg 
caagtcctgc 
acccagtcag 
accctgaaac 
ctggtggcca 
ctcccagacc 
tctcggcaaa 
cgtgatgcct 
acactcatca 
gaaacccagt 
tttgtgcccc 
gtctcctttg 
gaagacatag 
aagtaccgta 
tgttggcgta 
tgtgctctcc 
aatcatcaat 
atgaaggttg 
gggaaccagg 
caaaggatgc 
gccattaatc 
aaactaaggc 
ataacctctc 
cttttctttt 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1826 



<210> 6 

<211> 1439 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No.: 1556751CB1 

<400> 6 

gagtatccct tgtttaatca cttttgtggt 
ttccttgaag agtttagccc tggctcactt 
agaagaaaaa aagagacaaa ttacccagaa 
caaatgttaa ttttcctaga aaatccttca 
ctcagggtgg cttctgcgtc cccgccgcca 
gttcctcccc gggactccag aatttctctc 
gttggcaaaa cgcagggccg gctcccaaaa 
gtccccaggc ctcccagcgc aaacttaaag 
gccagctggg ctttttaaca acctagagac 
ggaaacgggg cttgccagag acactcacag 
gaatctccac atcattgtct ttcttgtgcc 



taaaagagac ctttgggtca gtctgcctca 60 
ttcactctat ttcttctcct gtctcaagaa 120 
acccctccct tccccacatg gaggccttgg 180 
gacctgaaga cgcaggaaaa gaatctggct 240
ggccccagac tatggtcaca gggccgtcct 300 
ctcaaaggaa agaaaacagg gcatgcgctt 360 
accccatgtg tgtacgatta aaagttggcc 420 
agacagggct ttgctgaaaa ccaaacatgg 480 
tttccggagc tgcctggaac agagcctgcg 540 
tttccttcat ggcctgtttt ggtcccctaa 600 
ttttccttgg tgagcaacag aaagggaagg 660 






gttccaagcc 
gctgcccttt 
agaaagcgct 
gaaaagtagg 
gtacgtaaaa 
tggatggatg 
acaaaaccaa 
tttgggcagc 
agccagtgca 
ctgagatggg 
cctctgcgcc 
ggtacgtcca 
tttagaaggt 



tctaaaaatg 
cgaggccagt 
gagagctcgc 
tttcttggct 
caaaataggg 
gatgtatgga 
agctgattgg 
tttgagaagc 
tttattttaa 
tccacgcatc 
cattctcttc 
taaagccagt 
aaacaaattt 



tgctttgtga 
gagctcagcc 
aggttcatta 
tgatgtagac 
cttggctggt 
tgaatagata 
aaacaattaa 
ggtacaagag 
gctcttagaa 
tctctacact 
tcacgcatat 
attacactta 
aataaagcta 



tcaggagtgc 
tccaaggctt 
aagaaggcaa 
tggcttgctt 
caaaggagac 
gatggtgttt 
ttgtgggtgt 
ttctgtgcct 
gcaactcctt 
tccttctctc 
ccatgagctt 
aatgaagtat 
ccaataatga 



gctccaaacc 
taaagccaca 
agcactggtt 
tgatttttag 
aagcaggatg 
gcatgtaaat 
ctgaggggga 
gtgtgtccag 
ggcccaggaa 
cgtgggatac 
taatttcact 
tcttttttgt 
gaaaaaaaaa 



aaatacgcgc 
tttcagcaag 
tctctcctta 
tgaagggaat 
gatggatgga 
tgcagagaaa 
aggtcgcagc 
ccctggagcc 
tgcgtgaccc 
tggactcgtg 
ttctgatcac 
aatcgttttt 
aaaaaaaaa 



720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1439 



<210> 7 
<211> 3047 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No.: 1656953CB1 

<400> 7 

cgagacagag gaaatgtgtc tccctccaag 
ggttttgcct tagcaatgca tcggtctctg 
caaggtgcag ggttaatact cttgccagtt 
ttttaataga aaactaaagg ggcaggggaa 
tcgatggggc atttggaact tctttttaaa 
aactctggtg tttaacactt aagggagaca 
ggccacgaga ctctaggtga tgtgtgaagc 
ctgtctggcc attcagagga ttctaaagac 
cacttaaata aatgcaaatg caacatttct 
tcatttgggg tgaaggagac atttctgtcc 
gtatgattcc tgggatccaa cgagccctcc 
gcccaggccc atcgtctgtt ctctgaatgc 
gaacccctct gtggaaccca caaggggaga 
accttccctg gcaggctggg tccctctcct 
cccaccacgg ggggagagcc agcaacccaa 
aaccactggg ctcaaacacg tgctttattc 
atggaaattc ttgtttgggg gatcttgggg 
gccaagaggc cattaacaaa tcgtccttgt 
cacagtgggg aatccaaggg tcacagtatg 
tctcgctaga cacagtgttt ctgcccaggt 
catggggacg ggggaagttt tcacttggag 
ccaaataggt caataattct gggagactct 
tctctccctc ccctcatccc acatctcaaa 
cacccagctc gccatgccta ctcattcctg 
cttctttgtc atttgagaaa ggatgcagga 
cagaaaaacc agggcaggac agttatcgac 
tagagggact ccacccctgc tcaacagctt 
ctctgccttc ggtggcccac acacctaagc 
aacacatcta cgtgtagcac tacgacgtta 
aggctctgat taaggatgtg gggaagtggg 
ctggaggcct gtctgttagc cagtggtgga 
tgccatcttc cctgcgatca ggcaaaaaag 
tgtgttatgt ccattttgca ggatgaactg 
ttgctttgtc ttttccatcc tcatcacaag 
tctttcgatg gatggagatg atcattaggt 
atttctgtga aaactaggag aacagagatg 
taacacagtc tttttaaaac taacatagga 
tctccattgt ctaaatcagg aaaacaggaa 



gccccaaagc ctcagagaaa gggtgtttct 60 
aggtgacact ctggagcggt tgaagggcca 120 
ttgaaatata gatgctatgg ttcagattgt 180 
gtgaaaggaa agatggaggt tttgtgcggc 240 
gtcatctcat ggtctccagt tttcagttgg 300 
aaggctgtgt ccatttggca aaacttcctt 360 
tgggcagtct gtggtgtgga gagcagccat 420 
atggctggat gcgctgctga ccaacatcag 480 
ccctctgggc cttgaaaatc cttgccctta 540 
ttggcttccc acagccccaa cgcagtctgt 600 
tattttcaca gtgttctgat tgctctcaca 660 
agccctgttc tcaacaacag ggaggtcatg 720 
aatgggtgat aaagaatcca gttcctcaaa 780 
gctgggtggt gctttctctt gcacaccact 840 
ccagacagct caggttgtgc atctgatgga 900 
tcctgtttat ttttgctgtt actttgaagc 960 
ctacagtagt gggtaaacaa atgcccaccg 1020 
cctgaggggc cccagcttgc tcgggcgtgg 1080 
gggagaggtg caccctgcca cctgctaact 1140 
gacctgttca gcagcagaac aagccagggc 1200 
atggacacca agacaatgaa gatttgttgt 1260 
tggaaaaaac tgaatatatt caggaccaac 1320 
gcagacaatg taaagagaga acatctcaca 1380 
aatttcaggt gccatcactg ctctttcttt 1440 
ggacaattcc cacagataat ctgaggaatg 1500 
aatgcattag aacttggtga gcatcctctg 1560 
ggcttccagg caagaccaac cacatctggt 1620 
gtcatcgtca ttgccatagc atcatgatgc 1680 
tgtttgggta atgtggggat gaactgcatg 1740 
ctgcggtcac tgtcggcctt gcaaggccac 1800 
ggagcaaggc ttcaggaagg gccagccaca 1860 
tggaattaaa aagtcaaacc tttatatgca 1920 
agtttaaaag aatttttttt tctcttcaag 1980 
cccttgtttg agtgtcttat ccctgagcaa 2040 
acttttgttt caacctttat tcctgtaaat 2100 
agatttgaca aaaaaaaatt gaattaaaaa 2160 
aagcctttcc tattatttct cttcttagct 2220 
aacacagctt tctagcagct gcaaaatggt 2280 






ttaatgcccc 
tatgatccca 
tggtttgtgc 
ctaatcaaag 
tttaaaaata 
aagctctgga 
ttcagcagat 
gcttgaatta 
gtaatcactt 
tttgtttgac 
atgtttccca 
caaaatggtg 
tctgttatgt 



ctacatattt 
gaaaacatct 
attttctcaa 
acactatttt 
aattgtgttt 
atccctttat 
tttgcccact 
gatccctgca 
catgaacgct 
taattctgga 
aactgtgagg 
ctttgagggt 
gcctatccta 



ccatcacctt 
gtctctactt 
ctaaaaatag 
catactagat 
tggtctgttc 
tgtgctgttg 
attcctctga 
aaggcttgct 
aaatgagaat 
attacaagat 
agggaaggct 
cagcctttag 
ataaactctt 



gaacaatagc 
cggctgcaaa 
agatgataat 
tcctgagaca 
ttgtagataa 
ctcttatctg 
gctgaagttc 
ctgtgatgtc 
gtaagtattt 
ttctatgcag 
cagagatcga 
gaaggtgcag 
aaacacaaaa 



tttagcttgg 
acccatggtt 
ccgaattctc 
aatactcact 
tgcccttcta 
caaggtggca 
tttgcataga 
agatgtaatt 
ttaaatgtgt 
gatttacctt 
gcttctcctc 
ctttgttgtc 
aaaaaaa 



gaatctgaga 
taaatctata 
catatattca 
gaagggcttg 
ttttaggtag 
agcagttctt 
tttggcttaa 
gtaaatgtca 
gtatttcaaa 
catcctgtgc 
tgagttctaa 
ctttgagctt 



2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3047 



<210> 8 

<211> 3017 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No.: 1662318CB1 



<400> 8 

cgcaaactca 

aatcaggtcc 

cagagagtga 

gacctcatca 

gccatcggcg 

atggtggaga 

gtggtgggca 

ctggtgcccg 

cagggcctgc 

ctggtgcgag 

gggaaggaga 

cagctggaac 

ctgatggcgc 

gacgagatgc 

gagagcaccg 

gtggagcggg 

gagtgggtgg 

gccgacggca 

aactcagtgc 

gatgaagagt 

cccgccaacc 

gagtgagccg 

cgcacacccc 

ccgggggtct 

cgggagggac 

acatcacaca 

attgagcacc 

aaatagacaa 

agtcacgcaa 

ccctgtggct 

cagccaggca 

ctcagagggg 

gctcggggaa 

ttctttcctt 

tctttccttc 

tcctgtcctg 

tgcagtggtg 

gcctcagcct 



accctttcgg 
tggagagcat 
gccgcggcta 
tcctgctctt 
cgttgcgggg 
cgcagcagct 
cgcccgaggt 
acaaccggcg 
cccggcacgc 
ttcacgctta 
acaagaagaa 
atcacatctc 
acgacttcac 
tgacgcacga 
aggtgggcgt 
gacctgacga 
tgaccaagga 
agctgagcgg 
tggggcgcat 
tcgcgctggc 
tgccccgtcg 
ggcccccctc 
tgctccggct 
ccctcctcac 
aaggcttctc 
cacactggca 
tactatgtgc 
atacatctgc 
acacacacta 
gaaatgacta 
acaccctcaa 
agacacacct 
agcccccaat 
ccttccttct 
cttccttctt 
tcctttcttt 
agatctcagc 
cctgagtagc 



aaacaccttc 
cagcatcatc 
cgacttcccg 
tgatgcgcac 
ccatgaggac 
gatgcgcgtc 
gctgcgcgtc 
cctcttcgag 
agccttgcgc 
catcatcagc 
gcagctgatc 
ccctggggac 
caagtttcac 
catcgccaag 
gcaggggggc 
ggccatggag 
caagtccaaa 
ctccaaggcc 
ctggaagctc 
cagccacctc 
cctggtgcca 
ccatggccct 
cacacacgcc 
taccgccaga 
tgtccgccct 
cacgcaggca 
ccagccctgt 
cctcatggaa 
attcctggca 
gcagataaac 
ccggctccat 
actgcttcct 
tctgcccaca 
tttttgtttt 
ttttgttttt 
cttttttgat 
tcactgcaac 
tgggactgca 



ctcaacaggt 
gacaccccgg 
gccgtgctgc 
aagctggaga 
aagatccgcg 
tacggcgcgc 
tacatcggct 
ctggaggagc 
aagctcaacg 
tacctgaaga 
ctcaaactgc 
tttcctgatt 
tcgctgaagc 
ctcatgcccc 
gcttttgagg 
gacggcgagg 
tacgacgaga 
aagacctgga 
agcgatgtgg 
atcgaggcca 
ccctccaagc 
gctgtggctc 
ctgcctgccc 
caccccggtg 
tcacacctcc 
tccatccatc 
tctaggcact 
ggtgacgttc 
gggcccccag 
agaccccctt 
cacatcctca 
cagatgggcc 
cccatttatt 
tgcccccaat 
gcccccagtt 
agaatcttgc 
ctccacctcc 
ggcacgcgcc 



tcatgtgtgc 
gtatcctgtc 
gctggttcgc 
tctcggacga 
tggtgctcaa 
tcatgtgggc 
ccttctggtc 
aggacctctt 
acctggtgaa 
aggagatgcc 
ccgtcatctt 
gccagaaaat 
cgaagctgct 
tgctgcggca 
gcacccacat 
agggctcgga 
tcttctacaa 
tggtggggac 
accgcgacgg 
agctggaagg 
gacgccacaa 
cccagctcca 
tccctgccca 
gaagcattta 
agcctcacgt 
cgtcattcat 
gggcattacc 
ccaggagagg 
cccctcccct 
ctgctccgct 
ggtctcggga 
cctccgcagc 
tccttccttc 
tctgcccata 
ctgtccacac 
tctgtcgccc 
tgggttgaag 
accacgccca 



ccagctccct 
gggtgccaag 
ggagcgcgtg 
gttctcagag 
caaggccgac 
gctgggcaag 
ccagcccctc 
ccgcgacatc 
gagggcccgg 
ctctgtgttt 
tgcgaagatt 
gcaggagctg 
ggaggcactg 
ggaggagctg 
gggcccgttt 
cgacgaggcc 
cctggcgcct 
caagctcccc 
catgctggat 
ccacgggctg 
gggctccgcc 
gtcggctgca 
gctgtaagga 
gaggggacca 
tcacttaggc 
tcaaatattt 
atagagaaca 
gcacctacac 
ggctgagcag 
tcctcctgcc 
ccatgggggg 
cccttccctt 
cttccttctt 
cccatttctt 
cccttccctt 
aggctgggag 
tgattctcgt 
gctaattttt 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 






gtatttgagt 
caggtgatct 
cccggcttca 
tcctgctcct 
tggctggaac 
ggacttaagg 
gcccctacgt 
cgggggaggg 
gtttgctgcc 
ggggaagagc 
gcccaccact 
cagagcgaat 
ccgaagttat 



agagacgggg 
gctcgcctcg 
cacccatttc 
ctgatactgt 
tgcccagcct 
attgctgggc 
agaaaggccc 
ggttcttggt 
ttcaccacat 
aaaatacatg 
gtcccccacc 
aaagcaaggc 
tcccttc 



tttcaccatg 
gcctcccaaa 
tttaaaaagg 
gcccccttgg 
gctcctggcc 
caccgcctct 
cggggcttta 
gctacagccc 
attagtgctt 
gagacgacgc 
ccatggctgg 
ttcttcccca 



ttggccaggc 
gtgatgggat 
atcccgtagc 
agatatttcc 
ccctggaagc 
ctgcctacca 
ttttagtctc 
tctccccacc 
gaccctggca 
accctccagg 
gaggggcctc 
aaaaaaaaaa 



tggtctcgaa 
tacaggcatg 
aggcagaaaa 
gtcctccacc 
ctccccacag 
ccattccata 
cttttcaggg 
cctaaaggga 
ggggacccca 
atgctcgctg 
tgaacggaac 
aaaaaaaaaa 



ctccgcatgt
agccaccgtg 
gccccttcca 
cacgtgtctg 
ctggtaatct 
tttaagtgga 
atgtcgtggg 
cgccgacgct 
tggaaaagat 
ggattcccac 
agtgtcccca 
attggtgcgg 



2340

2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3017 



<210> 9 

<211> 1735 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No.: 1996726CB1 



<400> 9 

tcgggaggaa ggagactaca cctgctttgc 
agtcagagtc aaggtggtga cagcgcccgc 
tcaggtgccc tatggagacg tggtcactgt 
caaggtgact tggttgtccc caaccaacaa 
gatataccaa gatggcactc tccttattca 
cacctgcttg gtcaggaaca gcgcgggaga 
cgtccagcca cccaagatca acggtaaccc 
agccgggggc agtcggaaac tgattgactg 
gttatgggct tttcccgagg gtgtggttct 
tgtccatggc aacggttccc tggacatcag 
ggtatgcatg gcacgcaacg agggagggga 
ggagcccatg gagaaaccca tcttccacga 
gggccacacc atcagcctca actgctctgc 
ggtccttccc aatggcaccg atctgcagag 
ggctgacggc atgctacaca ttagcggtct 
cgtggcccgc aatgccgctg gccacacgga 
gccagaagca aacaagcagt atcataacct 
gctcccctgc acccctcccg gggctgggca 
catgcatctg gagggccccc aaaccctggg 
cacggttcgt gaggcctcgg tgtttgacag 
atacggccct tcggtcacca gcatccccgt 
cagcgagccc accccggtca tctacacccg 
ggctatgggg attcccaaag ctgacatcac 
ggcaggggtt caggctcgtc tgtatggaaa 
catccagcat gccacacaga gagatgccgg 
cggcagtgac tccaaaacaa cttacatcca 
tgcttaggaa ctgacaacaa agcggggttt 
tcttaaataa tgtgtcacag tgcatggtgg 
ttgatctaca attgttggga aaaggaagca 



tgaaaatcag gtcgggaagg acgagatgag 60 
caccatccgg aacaagactt acttggcggt 120 
agcctgtgag gccaaaggag aacccatgcc 180 
ggtgatcccc acctcctctg agaagtatca 240 
gaaagcccag cgttctgaca gcggcaacta 300 
ggataggaag acggtgtgga ttcacgtcaa 360 
caaccccatc accaccgtgc gggagatagc 420 
caaagctgaa ggcatcccca ccccgagggt 480 
gccagctcca tactatggaa accggatcac 540
gagtttgagg aagagcgact ccgtccagct 600 
ggccaggttg atcgtgcagc tcactgtcct 660 
cccgatcagc gagaagatca cggccatggc 720 
cgcggggacc ccgacaccca gcctggtgtg 780 
tggacagcag ctgcagcgct tctaccacaa 840 
ctcctcggtg gacgccgggg cctaccgctg 900 
gaggctggtc tccctgaagg tgggactgaa 960 
ggtcagcatc atcaatggtg agaccctgaa 1020 
gggacgtttc tcctggacgc tccccaatgg 1080 
acgcgtttct cttctggaca atggcaccct 1140 
gggtacctat gtatgcagga tggagacgga 1200 
gattgtgatc gcctatcctc cccggatcac 1260 
gcccgggaac accgtgaaac tgaactgcat 1320 
gtgggagtta ccggataagt cgcatctgaa 1380 
cagatttctt cacccccagg gatcactgac 1440 
cttctacaag tgcatggcaa aaaacattct 1500 
cgtcttctga aatgtggatt ccagaatgat 1560 
gtaagggaag ccaggttggg gaataggagc 1620 
cctctggtgg gtttcaagtt gaggttgatc 1680 
atgcagacac gagaaggagg gctca 1735 



<210> 10 

<211> 1016 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature




<223> Incyte ID No.: 2137155CB1 






<400> 10 

ctgtacgttc 

atgggtcacc 

ctggactcca 

ttcctattca 

actaagcctt 

gtccaggtgg 

ggcttctaca 

tgaagaaagg 

aagtaaacta 

agcgctaaga 

acttttcctc 

gatatatttg 

ggcgaaatac 

aaccctggta 

agactgcact 

ccccgatgcc 

aggtcttaag 



ccctgtggcc 
tccaggtaga 
ttgcctcagt 
tcaatcagaa 
cttccttaaa 
acagttccca 
gcatgcaaaa 
caactaggat 
gaatttgtgc 
ccttactggg 
aagataactg 
cctgtaagat 
accgcacggt 
cactaaagca 
ggttgctgca 
ataacacctt 
cccaagtatc 



cacgcctagt 
ttacagagat 
tgtggttccc 
gaaacagtgg 
taatcagcta 
gagaatgcta 
acagaaccat 
gaggtttcaa 
acttgcttag 
atgggctctg 
accaagtgtt 
agctgtagag 
ggtgttggga 
gttcagtgtg 
aactcaggcc 
tggaatcccg 
tttctataca 



gaaaatgata 
aacaggctgc 
ataattatat 
ataccactgc 
gtatctgtgg 
agaattgcag 
ctacaggcag 
aagacggaag 
tggattgtat 
tctacagcaa 
tcttagaacc 
atatttgggg 
agaaaaattt 
ccagaggtta 
tgaatgagcg 
agcggccctc 
gtcccactgc 



tcgtacatct 
acccaagtga 
gcctctctat 
tttgctggta 
actgcaagaa 
aaccagatgc 
acaatttcta 
acgactaaat 
tggattgtga 
tgtgcagaac 
aaagttttta 
tggggacagt 
gtcagcttgg 
tttttttccc 
gaaacaaaaa 
agaaaccttt 
ggtgagcgtg 



ccctagagat

60

agatitzcttca

120

tataatagca

180

tcgaacacca

240

aggaaccaga

300

aagattcagt

360

ccaaacagtig

420

C ugCuC^aoo

480


cttgatgtac 


540 


aagcattccc 


600 


aagttgctaa 


660 


gagtttggat 


720 


ctcggggaga 


780 


attgctctga 


840 


aagccttgcg 


900 


tcaggcatcc 


960 


ggggag 


1016 



<210> 11 

<211> 2288 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No.: 2268890CB1



<400> 11 

caaccagggt 

agttgggagt 

ctgccgagcg 

cagccccgca 

gcagagccag 

gaacccctcc 

gagccgcccc 

aaccaccatt 

ctggctgcca 

tcgccaagag 

tgcacctaca 

aaggagcctg 

aatgagctgc 

ggcggcattg 

gtcacgcagc 

gagctctccc 

agcaagtaca 

tcagagatca 

ccccagccac 

atcaaccaga 

cctctgccca 

ccatggagag 

gtgaagccgg 

gggggctgga 

gagacgtaca 

atttactggc 

ggccgcaaag 

aagctgcggc 

aagcagttca 

cagaagggag 

cgcgggggcc 



caggctgtgc 
tcaaatgagg 
tggcactgag 
ggacccctgg 
tggagcccag 
agaggccatg 
ggagccaagc 
ttgcaaggac 
tgggagctgt 
agttcattta 
ccttcattgt 
aggtgcttct 
tcaagcagaa 
tgagcgaggt 
tctacatgca 
agctggagaa 
aggacctgga 
tcgcgcagct 
cccccgctgc 
tctctaccaa 
ctatgcccac 
actgcctgca 
agaacaccaa 
ccgtcatcca 
agcaagggtt 
tgacgaacca 
tctttgcaga 
tggggcgcta 
ccaccctgga 
gctggtggta 
attaccggag 



tcacagtttc 
ctgctgcgga 
gcagcggctg 
ccagccctgg 
tgaggcaggg 
gacaggctgc 
aggagggaag 
catgaggcca 
tgcaggccag 
cctaaacagg 
gccccagcag 
ggagaaccga 
gcggcagatc 
gaagctgctg 
gctcctgcac 
caggatcctg 
gcacaagtac 
tgaggagcac 
cccgccccgg 
cgagatccag 
tctcaccagc 
ggccctggag 
ccgcctcatg 
gagacgcctg 
tgggaacatt 
aggcaactac 
atacgccagt 
ccatggcaat 
cagagatcat 
taacgcctgt 
ccgctaccag 



ctctggcggc 
cggcctgagg 
acgctactgt 
ccccagcctc 
ctgcttggca 
cccgctgacg 
aggctttcat 
ctgtgcgtga 
gaggacggtt 
tacaagcggg 
cgggtcacgg 
gtgcataagc 
gagacgctgc 
cgcaaggaga 
gagatcatcc 
aaccagacag 
cagcacctgg 
tgccagaggg 
gtctaccaac 
agtgaccaga 
ctcccatctt 
gatggccacg 
caggtgtggt 
gatggctctg 
gatggcgaat 
aaactcctgg 
ttccgcctgg 
gcgggtgact 
gatgtctaca 
gcccactcca 
gacggagtct 



atgtaaaggc 
atggacccca 
gagggaaaga 
tgccggagcc 
gccaccggcc 
gccagggtga 
agattctatt 
catgctggtg 
ttgagggcac 
cgggcgagtc 
gtgccatctg 
aggagctaga 
agcagctggt 
gccgcaacat 
gcaagcggga 
ccgacatgct 
ccacactggc 
tgccctcggc 
cacccaccta 
acctgaaggt 
ccaccgacaa 
acaccagctc 
gcgaccagag 
ttaacttctt 
actggctggg 
tgaccatgga 
aacctgagag 
cctttacatg 
caggaaactg 
acctcaacgg 
actgggctga 



tccacaaagg 
agccctggac 
aggttgtgag 
ctctgtggag 
tgcaactcag 
agcatgtgag 
cacaaagaat 
gctcggactg 
tgaggagggc 
ccaggacaag 
cgtcaactcc 
gctgctcaac 
ggaggtggac 
gaactcgcgg 
caacgcgttg 
gcagctggcc 
ccacaaccaa 
caggcccgtc 
caaccgcatc 
gctgccaccc 
gccgtcgggc 
catctacctg 
acacgacccc 
caggaactgg 
cctggagaac 
ggactggtcc 
cgagtattat 
gcacaacggc 
tgcccactac 
ggtctggtac 
gttccgagga 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 




ggctcttact cactcaagaa 
taagccagct ccccctcctg 
tggccacagc acaaagaaca 
gctggattct gttttccgaa 
ttctgtccct cctactttcc 
gactacagac aactctttct 
agtaccttca taatatacat 
atatatgg 



agtggtgatg 
acctctcgtg 
actcctcacc 
gtcactgcag 
ttcacaccag 
ttaaataaat 
gtgtatgagc 



atgatccgac 
gccattgcca 
agttcatcct 
cggatgatgg 
acagcccctc 
taagtctcta 
ctcccttgtg 



cgaaccccaa 
ggagcccacc 
gaggctggga 
aactgaatcg 
atgtctccag 
caataaaaac 
cacgtatgtg 




caccttccac 1920
ctggtcacgc 1980 
ggaccgggat 2040 
atacggtgtt 2100 
gacaggacag 2160 
acaactgcaa 2220 
tatagcacat 2280 
2288 



<210> 12 
<211> 3304 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No.: 2305981CB1 

<400> 12 

ccctcttatg gattcccagc aagcatcagg 
acacaagcat ggacaagtgt gtgtttccaa 
gcacccaaac ctccgggcat ttggcattgt 
caagcaagag tgtaagaaaa tccactgccc 
aatagacgga aagtgctgca aggtgtgtcc 
ccaaagcttt gacaataaag gctacttctg 
tgtattcatg gaggatgggg agacaaccag 
tcaggtagag gtccacgttt ggactattcg 
gaagatctcc aagaggatgt ttgaggagct 
cctgagccag tggaagatct tcaccgaagg 
tcgtgtatgc agaacagagc ttgaagattt 
aaagggccac tgttaggcaa gacagacagt 
ctgcagctgg actgcaggct tattttgctt 
aaatgcagtc aattattcac gccatgcaca 
tgtcagccct tgaacatctc ctccaaagag 
gaggagggat agaacatcac aacactgctc 
ggttaaagac aaacaagacc ccagggtttt 
agaagggaat tgcttagtag gagttctgca 
cctttgaatt ttagaatgtc atgtgttctt 
tcactccctc cctccctcct tctctctctc 
acacacacac acacacgcac acgcacgtcc 
agcaaagcta gccaaaattc tacgttactt 
agtttttgtg cccaggagag taaataactg 
tggctgttta agtcaccaac aatagagtca 
cattcattca cttagaagtg gtaataattt 
ctgtacctat gggacttcca gaaagaagtt 
catgtaagaa aaaataattg ttgaagaaag 
ttgctttcac atcaataaaa tttaccaagt 
accatagttg tctggtcaga aaaattatat 
agggaagttt tccttcttct ccaattatag 
gtcctcatga gcatctgcat gttgactctt 
ggtggatatt ctgatgaaga tctttatcct 
caagcagata ttttagtcaa gaattccaga 
cccaatacca gagcataaac tatccattct 
gaagacctaa ttcttcacag caaggatctc 
ggggcaggaa tgaactgtag aaatgtttta 
atgactatag gtgagagaat tctttcctaa 
aaatgttcag tctttatgac aacctggcat 
agggccttat ggccagggtt tcttgggaca 
ccttggaaga gagaagcagt acatcccggt 
gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt 
ttatgcggct gctccctccg tcccagaggt 
actagatcct aaggcaaaga ggtgtttctc 



aaccattgtg caaattgtca tcaataacaa 60 
tggaaagacc tattctcatg gcgagtcctg 120 
ggagtgtgtg ctatgtactt gtaatgtcac 180 
caatcgatac ccctgcaagt atcctcaaaa 240 
aggtaaaaaa gcaaaagaag aacttccagg 300 
cggggaagaa acgatgcctg tgtatgagtc 360 
aaaaatagca ctggagactg agagaccacc 420 
aaagggcatt ctccagcact tccatattga 480 
tcctcacttc aagctggtga ccagaacaac 540 
agaagctcag atcagccaga tgtgttcaag 600 
agtcaaggtt ttgtacctgg agagatctga 660 
attggatagg gtaaagcaag aaaactcaag 720 
aagtcaacag tgccctaaaa ctccaaactc 780 
gcataatttg ctcctttgtg tggagtggtg 840 
actagaagag tcttaaatta tatgtgggag 900 
tagtttcttg gagaatcaca tttctttaca 960 
tatctagaaa gttattcaag tgaaagaaag 1020 
gtatagaaca attacttgta tgaaattata 1080 
ttaaaaaaat tagctcccca tcctccctcc 1140 
tctctctctc cctctctcac agacacacac 1200 
acactcacat taaactaaag ctttatttga 1260 
ttcccttgac tggatcccaa gtagcttgga 1320 
tgaacaagag gctctgccct taggtctttg 1380 
gggtaaagaa taaaaacact ttcatagcct 1440
ttccctaatg ataccacttt tcttttcccc 1500 
aaattgagta aaatcatcag aaactgaatc 1560 
aagttgatag aattcaaaaa ggccatcttt 1620 
aatagatcag tactcactaa tatttttgag 1680 
taaattagta aattctagaa gctctttaaa 1740 
gagttgattt ttactttgca aagtggctcg 1800 
cagttaagaa aattgttgtt catttaggga 1860 
aaaccttcct actatccttg tcttattcat 1920 
gaaggctgct cctaaaatgt ctacttgcag 1980 
ggggtctggc tttagaaatc atctttgtgg 2040 
aggcatgcct tctagatttg ttccctctga 2100 
aggacccaga aaccccatat gtctcattcc 2160 
gagggtttga taccaatagg ggaaaatgta 2220 
aaaggagtca attcttatga aagagacaca 2280 
agactctcac cagcacatca cacacgttct 2340 
tgagaggtca caaagcatta gtttgtgtgt 2400 
gtgtgtgtgt gtggtaaagg ggggaaggtg 2460
ggcagtgatt ccataatgtg gagactagta 2520 
cttctggatg attcatccca aagccttccc 2580 






acccaggtgt 
actcctgcct 
ggagctgtgt 
catagggtaa 
tgtgtattta 
gttgtagcca 
ctttcactgt 
tttctggatt 
aaaaaaaaaa 
cgcccttaag 
ccttagttgc 
gccttcccaa 
ggta 



tctctgaaag 
ccaggtgctg 
taagtcaaag 
agaggccaag 
tttgtatcat 
ttttctagtt 
tctcacagga 
tttaaattaa 
aaaaaaaaaa 
attccctggc 
tcgctaaaat 
tagcctccca 



cttagcctta 
ggacacacct 
tagaaaccct 
ctgcctgtag 
aaacacttgg 
aactcatgta 
catgtaccta 
taaaaaagtt 
actcgagggg 
cgcagttttt 
cccctttcgc 
tgaatgggaa 



agagaacacg 
ttgcaaaatg 
ccagtgtttg 
ttagtagaga 
aacaacaaag 
aacaagtaag 
attatggtac 
aattttgaaa 
gggcctgtac 
ggccgcgttt 
agcccgttta 
tggaattgga 



cagagagttt 
ctgtgggaag 
gtgttgtgta 
agaatggatg 
accataagca 
agtaacataa 
ttatttatgt 
aaaaaaaaaa 
cgggttcccc 
tggggaacct 
aaggctgggg 
agggaaattt 



ccctagatat 
caggagctgg 
gagaatagga 
tggttcttct 
tcatttagca 
cagtattacc 
agtcactgta 
aaaaaaaaaa 
gtaacaggtt 
ctgggtaccc 
ccggccgatt 
tggtaaatcc 



2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3304 



<210> 13 
<211> 708 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No.: 2457612CB1 



<400> 13 

ggaaagccag 

tcaacaccca 

acagttcctg 

agagaaactg 

aagcctgaca 

gccacagagg 

accgtggaag 

tgtcactgaa 

cattcaaatg 

tgaattccag 

attcagtact 

tggactgaag 



gaagtgcagg 
ggcctactgg 
cctctggaga 
atcctcttgg 
acagtccctg 
ggaatgccac 
ggtgcccctt 
tatgaagtta 
acaaatcaga 
gtgaaaccca 
gaatcagcgg 
gccgctttgt 



aatcatttca 
aactcccttg 
agaactggaa 
gaagccaaga 
ctccattact 
cagcccacca 
catttgtcat 
tatccagaga 
cattttccac 
aaaacccgct 
acccagagtg 
tcgactcttg 



tcagggccaa 
gagagaatag 
aatataactg 
ttcaaaggac 
gactctgtca 
cagaacccac 
cttggactgg 
aaatgggtca 
agtagaaaat 
tggtgaaggc 
agtgagcagt 
ctcaggtgta 



taactacacc 
agacagatgt 
actttagctc 
ctcatgtgcg 
aacggttccc 
ccaccaacct 
gaaaagccac 
ttcagtggga 
ctgaaaccaa 
ccggtcagca 
ttctgcagga 
agggcaac 



acccctgagg 
aaagcaacca 
aagcccaaca 
atacatccaa 
caaagaggag 
cactgtggtc 
taaatgacac 
agaacaagtc 
acacgagtta 
acacagtggc 
gagatgcctc 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
708 



<210> 14 

<211> 2040 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No.: 2814981CB1 



<400> 14 

cggccagccg ccgcgcgctg cagctctccg ggacgcccgt gcgccagctg cagaagggcg 60 

cctgcccgtt gggtctccac cagctgagca gcccgcgcta caagttcaac ttcattgctg 120 

acgtggtgga gaagatcgca ccagccgtgg tccacataga gctcttcctg agacacccgc 180 

tgtttggccg caacgtgccc ctgtccagcg gttctggctt catcatgtca gaggccggcc 240 

tgatcatcac caatgcccac gtggtgtcca gcaacagtgc tgccccgggc aggcagcagc 300 

tcaaggtgca gctacagaat ggggactcct atgaggccac catcaaagac atcgacaaga 360 

agtcggacat tgccaccatc aagatccatc ccaagaaaaa gctccctgtg ttgttgctgg 420 

gtcactcggc cgacctgcgg cctggggagt ttgtggtggc catcggcagt cccttcgccc 480 

tacagaacac agtgacaacg ggcatcgtca gcactgccca gcgggagggc agggagctgg 540 

gcctccggga ctccgacatg gactacatcc agacggatgc catcatcaac tacgggaact 600 

ccgggggacc actggtgaac ctggatggcg aggtcattgg catcaacacg ctcaaggtca 660 

cggctggcat ctcctttgcc atcccctcag accgcatcac acggttcctc acagagttcc 720 

aagacaagca gatcaaagac tggaagaagc gcttcatcgg catacggatg cggacgatca 780 






caccaagcct 
gaatttatgt 
gtgacatcat 
ccgtgctgac 
tcagcatcgc 
gcctgcagac 
ctcagcaggg 
aggggcccga 
cccaacatcc 
aaaactgcct 
acccatctgc 
cttcccccct 
taccaagctg 
catctgatcc 
ccccctggct 
gaggccgcgg 
gcggccatgg 
gccggcttcc 
ccagaggcat 
gggcatttgt 
tactgtatgg 



ggtggatgag 
gcaagaggtt 
cgtcaaggtc 
cgagtctcct 
acctgaggtg 
aacggagggc 
cggcagcctc 
atttccgcct 
ccttgtacag 
tccatggagg 
agtatcccct 
gacaaacgcc 
tagggccagg 
ctttggggtg 
gcggagctga 
ggagcacgtg 
ggcagcctgc 
ccttcccacg 
gcaggctgct 
gagctttgct 
aaaataaagt 



ctgaaggcca 
gcgccgaatt 
aacgggcgtc 
ctcctactgg 
gtcatgtgag 
agcgcccccc 
ctcctggctg 
ggggagtgtt 
atgatcctga 
tcccctcctc 
gctcctgccc 
cacctgacct 
gctgctgcct 
cgggggtggg 
gccccgccct 
gaaagttggc 
agaggacagt 
cagctctggg 
gggcaccacc 
gtaaatggat 
ttacaagcac 



gcaacccgga 
caccttctca 
ctctagtgga 
aggtgcggcg 
gggcgcattc 
cgagatcagg 
tccggggcag 
ggatccacat 
aagtcacttc 
tcctagcttc 
ctcctactgc 
gaggccccag 
gccagcctgg 
gtccagccca 
gccatgaggt 
tgctgcctgg 
ggacgtggag 
atgcagcagc 
ccctcatcca 
tcccagtgtt 
aaaaaaaaaa 



cttcccagag 
gagaggcggc 
ctcgagtgag 
ggggaacgac 
ctccagcgcc 
acgaaggacc 
agcggaggct 
cccggtgccg 
caagttctcc 
ccgcctctgc 
aggtctgggc 
cttccctctg 
ggtccctgga 
gagcaggcac 
tttcctcccc 
ggaagcttct 
ctgcggggtg 
cgctcgcatg 
gggaacgagt 
gcttgtactg 
aaaaaaaaaa 



gtcagcagtg 
atccaagatg 
ctgcaggagg 
gacctcctct 
aagcgtcaga 
accgtcggtc 
gggcttggcc 

ggatattcac 
ccctgtgaac 
tgccaagctt 
ccctaggact 
ggacaggtca 
tgagtgaatg 
aggcaggcag 
cctccccaag 
tgaggactga 
gaagtgccgc 
gtgtctcaag 
tatgtttctc 
aaaaaaaagg 



840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 



<210> 15 

<211> 2121 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No.: 3089150CB1 

<400> 15 

gtaaaagctg gttgtgatcg catcatagac 
tgcgggggaa atggatctac ttgtaaaaaa 
ggatatcatg atatcatcac aattccaact 
aaccagaggg gatccaggaa caatggcagc 
tatattctta atggtgacta cactttgtcc 
gttgtcttga ggtacagcgg ctcctctgcg 
ctcaaagagc ccttgaccat ccaggttctt 
aaatacacct acttcgtaaa gaagaagaag 
gcatgggtca ttgaagagtg gggcgaatgt 
agactggtag aatgccgaga cattaatgga 
aagccagcca gcaccagacc ttgtgcagac 
tggtcatcat gttctaagac ctgtgggaag 
tcccatgatg gaggggtgtt atctcatgag 
ttcatagact tttgcacaat ggcagaatgc 
agggcaaggc aaagtgagga agggctggtg 
gcgtatcttg ccagtaacca gtgaggtgta 
aaaaggagtt gaatcatcag agtaaactgc 
ggattattaa cctctgagca gtgatatagc 
tttcttttgt tacatctatt acaagtttag 
gaactattac aacccctgtt tcctggtact 
aaatgaaaag taggagaaaa gtgagatttt 
caatgggggg agaaaggagt acaaatagga 
ggtttcagag aatgtttata cattatttct 
atgagagaaa ggctcagcaa cgtgaaataa 
ccatctcagt ctttatttgt gtaattcatt 
caagtgcatt aaagtctaca atggaaaaaa 
tagaggagac acaatgagct tagtacctcc 
gctttgggaa tatggatgta aagaagtaac 
caaggaggat gaaacgccgg aacaaaaatg 
gggacattga gatcacttgt cttgtggtgg 



tccaaaaaga agtttgataa atgtggtgtt 60 
atatcaggat cagttactag tgcaaaacct 120 
ggagccacca acatcgaagt gaaacagcgg 180 
tttcttgcca tcaaagctgc tgatggcaca 240 
accttagagc aagacattat gtacaaaggt 300 
gcattggaaa gaattcgcag ctttagccct 360 
actgtgggca atgcccttcg acctaaaatt 420 
gaatctttca atgctatccc cactttttca 480 
tctaagtcat gtgaattggg ttggcagaga 540 
cagcctgctt ccgagtgtgc aaaggaagtg 600 
catccctgcc cccagtggca gctgggggag 660 
ggttacaaaa aaagaagctt gaagtgtctg 720 
agctgtgatc ctttaaagaa acctaaacat 780 
agttaagtgg tttaagtggt gttagctttg 840 
cagggaaagc aagaaggctg gagggatcca 900 
tcagtaaggt gggattatgg gggtagatag 960 
cagttgcaaa tttgatagga tagttagtga 1020 
ataataaagc cccgggcatt attattatta 1080 
aaaaaacaaa gcaattgtca aaaaaagtta 1140 
tatcaaatac ttagtatcat gggggttggg 1200 
actaagacct gttttacttt acctcactaa 1260 
tctttgacca gcactgttta tggctgctat 1320 
accgagaatt aaaacttcag attgttcaac 1380 
cgcaaatggc ttcctctttc cttttttgga 1440 
ttgaggaaaa aacaactcca tgtatttatt 1500 
agcagtgaag cattagatgc tggtaaaagc 1560 
aacttccttt ctttcctacc atgtaaccct 1620 
ttgtgtctca tgaaaatcag tacaatcaca 1680 
aggtgtgtag aacagggtcc cacaggtttg 1740 
ggaggctgct gaggggtagc aggtccatct 1800 




ccagcagctg gtccaacagt cgtatcctgg tgaatgtctg 
atgatttttt ccatatgtat atagtaaaat atgttactat 
gtattggttt gggtgttcct tccaagaagg actatagtta 
catatttatt tttatacatt tatttctaat gaaaaaaact 
tggaagtgca tataaaatag agtatttata caatatatgt 
ttttggaaaa aaaaaaaaaa a 



ttcagctctt 
aaattacatg 
gtaataaatg 
tttaaattat 
tactagaaat 




ctgtgagaat 1860
tactttataa 1920 
cctataataa 1980 
atcgcttttg 2040 
aaaagaacac 2100 
2121 



<210> 16 

<211> 2900 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No.: 3206667CB1

<400> 16 

gaagttttaa aaaaaactac agcagccaaa gaaactatat atatatatat atatatccag 60
aatgattgcc tctactgtcc tcattgactt gtttgaacct tagtgcctta ccctgtcctc 120
ttcccagttc tctttataga agctctagga gctttcgaaa agccaaagtc tttctgaaga 180
atctgtgctg gacagacata attccctttc tcattgtctc catctttgtt ggtcatggta 240
aggtttttcc atcagcctct gaaaaaatag ttgtgcacaa catctgctca ctggactgtc 300
tgatccaatg taattggctg cgtctggcta attctaagca ctaaagtcta catctaagct 360
atagatttaa gcttgaagct acagattata tcactatcac caccacccct cacccagtga 420
aatcagacag tcagtcatct taagttaaag atatttgttg tctttgaatg atttgctgtc 480
acagactatt tggtagaaga aatatttttc acctgagaga ggaagagaaa tttctctagt 540
aacacaaaga gtgagttcta aaaggcatgc ccacatctct ttcgtgcctt aaggatagtg 600
agatgcacac ttatatatat actgtatata tttatatatt tatatatata tttcatatat 660
atatataata ttgcaagctt aagtttgcaa tttcccaaac aatacaaaaa gcaaattaca 720
caccctcacc actgttctta tctctatagt gatgaaacat taattaggga tcttgctgct 780
tttctttttc tacacgaagt tttcattaaa gccacagaat aattgatagg gcagctgttt 840
gagaacaggt cccattttca cattagggct ttaaatgaat tagaaactat ttgaggctat 900
aaaaatgtcc ttgagtttgg agcctgagct ctggtgaaat gctgatacat ctgatctatc 960
atgggaattg cagttagaga gagtaaggaa taccatttag tcatctatcc gttcttcact 1020
tagcaggaat atgaaagaaa ggcacatgtt taagaggaat acctaaaggt ttttctaaat 1080
tccaacattt aaaaggcaat tgtgggctat ttttattttt taatattttg aaataaagtt 1140
tagtgtctag ggctgggagc caggactgat cttccatttc tttttctttg ttcccagcca 1200
tgcttttgta acttgccagg tggacttgac caactacatt accatgctgt gcctcagttt 1260
acccatttgt aaaatgggat taataatact tacctacctc acaggggtgt tgtgaggctc 1320
tattcatttg ctcctttatt ctttcctgta ttctctgtat gtccagcact ttgtagccat 1380
gggaggaaag ggactataaa agtgtacaat gttaatggaa tgatacggta cctgaaagcc 1440
ttgttttcta gtaagaaaat gctaccttgc tgtacatact tataaccttg tatttggaaa 1500
tgagaaatag gtttatattt tcagatctct caaaaatcac atcatttgac caaagaataa 1560
tttaagacac atagaacaga tttttttaat ttatattttc atcctgacca gcttagttct 1620
aataattttt agttgtgagt gattaaaaaa ctttggatca attttggtca aacatgccaa 1680
ctttgtagtc tgagtgacag gcaaggattt ttgggtttaa gatgcacttt tagcacacat 1740
ttgtatttcc cttggcatat cagattgagc taatggtgat gttatttcaa tctaacagcc 1800
accaatctga aattgtattt caaatgttga ttctgtagtt ctttaaataa taatgaagct 1860
catcttatac attttgcttt caccaattga ttccttcttc ttttagccca ctattaaaac 1920
atttcttact gaatggttca tgtaggcttg ctgaacagca cgcattactt gcttcctgaa 1980
gagttccccc attcatccat ttgtcccatt agttgctgtg gattatcaag ttttgaagga 2040
actgtacatc ccaacagact gaaacattct aagtgaaatg agtataatcc aagtaactgg 2100
tgaactttgg aggtttggag cttgaagaga atggctaaga agatttgaat tatagggagg 2160
gaacagaaat catacatgaa aaggttttac tgagaagggg aaaaccttag atagagggac 2220
atgtgaaaca aaatcatttg aaattttgat tcagacatcc atttccagtg gcaaacagca 2280
aagcctgaac ccataaaccc aaatgatagg tgaagttggg tggttttatc caatgtctca 2340
agcaagcaat gtctgggaat atcatagagt aacaagtgct ggtcagccaa agaaacattc 2400
actgctggtg aaccaatacc ataagcatgt attatctaag cacttgatca agaaatatac 2460
atgttgtaca agctctcaat tttgttcatt tattatcaaa tttttaaaat acaagtttgg 2520
tatgtgattt ggaaaagatg ccttctggat cttaagccag ttgtcagtgg aggtcctcag 2580
ggctgcaaat gtcaagacat aaccctgttc ctcaccatca tgataccaga tacaggtgaa 2640
tacataggaa ctatctgcct gtgtcctcaa tctcccttca aacaagatgc tgatttgtag 2700
ggtacttggc aggttaaatt aaaccagaag aggtgactta ataaaaaagg gaatgacatt 2760
tagggtataa agatctcata agaaatgtaa tatgtaaatt atatcttgct ttatgttgta 2820
aaatatacat tgtttgcgct agaatagaaa tgatttcttt tcaataaaaa gaaagaagga 2880
ctctaaaaaa aaaaaaaaaa 2900



<210> 17 

<211> 2507 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No.: 3284695CB1 

<400> 17 

cagagtgaaa cttgtgcctg gtgaccaaag tccctccaaa gtgctcttcc ttctgggtta 60
ttcaagccaa atatctgggt ttccccctct cctcattccc tagcaaaccc caattatctt 120
ccaagatagg agatatttcc catccccttc ctttgtaaat atctcatctc ccactggaga 180
gcccaggagc ctattcctgg catggatgtt ctgtccacac ttgaggctgg gcggtgtatc 240
agacccttca agcagcctgg ctggggccca ggactgagtc tggggtcagc tttcacggtc 300
gcttttccct tcctcaccac ccaccacagc ccaccttgca tgcatggcca gcccctccac 360
tccagcctga gccatgtgtg cccctgcggg aggacccatt catgccagaa agctggtaac 420
tccctcccag catccctgcg gaaggagtca gtttctgaga gtgtgacttt tcaaggcgaa 480
tgatggggaa gggttcccca gtccccacag tggccccacc tctgggccct gcaccagagc 540
ccttctgtgt cacggcgggc tgtgcaccca tgcacacacc tacgcacaca caacactccg 600
cactgcagta tattcttgcc aaagatttcc tttaaaagca agcactttta ctaattatta 660
ttttgtaaat gtttatcttc ttctgtcttc tccctccctg aatctatttt actgttgttt 720
attgttgaat ctgtgtgtca gccaggagag cgctgtctgg ccttgaacat gggctgggat 780
gggaaagggt ctgggagaag atgggcaaca aagagccagg gagtcatgga catcgcagcg 840
acgcagaccc cagcaggttc agtcccgtgc tgccaccagc tgtccagctg ggtgtctgga 900
gggaagaggg cagaggaggg tcatgtccct tcagctgggg gaggggccca gtgagctcca 960
cgtggctttt tcccaaaggg agcaagaggg aaggattggg cgagaaaaca atggagaggg 1020
gacctgcgaa ggaaaacagg gaggaagtga gcggtttgat cagcctgcta tcacggtgtt 1080
ctggctctct tatttagcca ggcgcttaag ggacagatac atcacatcct aagtttggga 1140
aaggcctttg acccatgtca tctgagcgtc tcctccagta gctctgaaag ctgtggacac 1200
caatggccag gattccttct cccctggttt ttgaggatcc ctgggtcttc tgagactggc 1260
caggagaggg atggtggggc cagtggttgt gtgaaagcag gaggggcagc cctcctggac 1320
aagtgtgatc cccctataaa cggctctcag gaggttagtg agtaggagat tctgccttgt 1380
tctgatgagc ctgtgcaggg gctccagggg agcatgctgt ccagggggca cagaagggtg 1440
gtgagtgtga tcaaatctag tctcactccc acttttttag tctcactcct acttttgtcc 1500
accacccctg cctcctggat cttctcccac tttttttttc agctttagga cctggggaga 1560
tcctgtgagt caaggcagac acccaatcct gcccccacac tcggggtcct ccaagaggtt 1620
ggggggcaga gtcccagagc agccctttac cccaggtcca ggccctggaa tcctgagact 1680
cgcgtttcct tggccagtgg taacacagga cgtgtgtgcg catgtgcaag tgtggatgta 1740
tgtgtgtgcg tgtgttttgc tcatttcttt agggaacttg ggagtcgggg ttggaggtgc 1800
tgggcaatgg aacttcaaat tcaatgtcgc ccagcagtga ggggagtcgg gaggtgaggc 1860
ctgtaggcca accaattggt ggagtctcag cgatagccca ggtgagaagt ggttcaccca 1920
gaggggcagg gtgggggcct cgggcagatc tgtccctctt ggcccctctg tcctcaaatg 1980
tccaaaatgt tggaggacct ctgttcatat cccacgcctg ggctcttgcc agcagtggag 2040
ttactgtaga gggatgtccc aagcttgttt tccaatcagt gttaagctgt ttgaaactct 2100
cctgtgtctg tgttttgttt gtgcgtgtgt gtgagagcac atcagtgtgt gcaggctgtg 2160
tttccccatt tctctcctcc cttcagaccc atcattgaga acaaatgtaa gaaatccctt 2220
cccaccaccc tccctgcctc ccaggccctc tgcgggggaa acaagatcac ccagcatcct 2280
tccccacccc agctgtgtat ttatatagat ggaaatatac tttatatttt gtatcatcgt 2340
gcctatagcc gctgccaccg tgtataaatc ctggtgtatg ctccttatcc tggacatgaa 2400
tgtattgtac actgacgcgt ccccactcct gtacagctgc tttgtttctt tgcaatgcat 2460
tgtatggctt tataaatgat aaagttaaag aaaaaaaaaa aaaaagg 2507



<210> 18
<211> 2929
<212> DNA
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No.: 3481610CB1
<400> 18

aagctcggaa ttcggctcga gatgggttcc tcatcccttc ctgctgcaaa agaagttaac 60
aaaaaacaag tgtgctacaa acacaatttc aatgcaagct cagtttcctg gtgttcaaaa 120
actgttgatg tgtgttgtca ctttaccaat gctgctaata attcagtctg gagcccatct 180
atgaagctga atctggttcc tggggaaaac atcacatgcc aggatcccgt aataggtgtc 240
ggagagccgg ggaaagtcat ccagaagcta tgccggttct caaacgttcc cagcagccct 300
gagagtccca ttggcgggac catcacttac aaatgtgtag gctcccagtg ggaggagaag 360
agaaatgact gcatctctgc cccaataaac agtctgctcc agatggctaa ggctttgatc 420
aagagcccct ctcaggatga gatgctccct acatacctga aggatctttc tattagcata 480
ggcaaagcgg aacatgaaat cagctcttct cctgggagtc tgggagccat tattaacatc 540
cttgatctgc tctcaacagt tccaacccaa gtaaattcag aaatgatgac gcacgtgctc 600
tctacggtta atatcatcct tggcaagccc gtcttgaaca cctggaaggt tttacaacag 660
caatggacca atcagagttc acagctacta cattcagtgg aaagattttc ccaagcatta 720
cagtcaggag atagccctcc attgtccttc tcccaaacta atgtgcagat gagcagcatg 780
gtaatcaagt ccagccaccc agaaacctat caacagaggt ttgttttccc atactttgac 840
ctctggggca atgtggtcat tgacaagagc tacctagaaa acttgcagtc ggattcgtct 900
attgtcacca tggctttccc aactctccaa gccatccttg ctcaggatat ccaggaaaat 960
aactttgcag agagcttagt gatgacaacc actgtcagcc acaatacgac tatgccattc 1020
aggatttcaa tgacttttaa gaacaatagc ccttcaggcg gcgaaacgaa gtgtgtcttc 1080
tggaacttca ggcttgccaa caacacaggg gggtgggaca gcagtgggtg ctatgttgaa 1140
gaaggtgatg gggacaatgt cacctgtatc tgtgaccacc taacatcatt ctccatcctc 1200
atgtcccctg actccccaga tcctagttct ctcctgggaa tactcctgga tattatttct 1260
tatgttgggg tgggcttttc catcttgagc ttggcagcct gtctagttgt ggaagctgtg 1320
gtgtggaaat cggtgaccaa gaatcggact tcttatatgc gccacacctg catagtgaat 1380
atcgctgcct cccttctggt cgccaacacc tggttcattg tggtcgctgc catccaggac 1440
aatcgctaca tactctgcaa gacagcctgt gtggctgcca ccttcttcat ccacttcttc 1500
tacctcagcg tcttcttctg gatgctgaca ctgggcctca tgctgttcta tcgcctggtt 1560
ttcattctgc atgaaacaag caggtccact cagaaagcca ttgccttctg tcttggctat 1620
ggctgcccac ttgccatctc ggtcatcacg ctgggagcca cccagccccg ggaagtctat 1680
acgaggaaga atgtctgttg gctcaactgg gaggacacca aggccctgct ggctttcgcc 1740
atcccagcac tgatcattgt ggtggtgaac ataaccatca ctattgtggt catcaccaag 1800
atcctgaggc cttccattgg agacaagcca tgcaagcagg agaagagcag cctgtttcag 1860
atcagcaaga gcattggggt cctcacacca ctcttgggcc tcacttgggg ttttggtctc 1920
accactgtgt tcccagggac caaccttgtg ttccatatca tatttgccat cctcaatgtc 1980
ttccagggat tattcatttt actctttgga tgcctctggg atctgaaggt acaggaagct 2040
ttgctgaata agttttcatt gtcgagatgg tcttcacagc actcaaagtc aacatccctg 2100
ggttcatcca cacctgtgtt ttctatgagt tctccaatat caaggagatt taacaatttg 2160
tttggtaaaa caggaacgta taatgtttcc accccagaag caaccagctc atccctggaa 2220
aactcatcca gtgcttcttc gttgctcaac taagaacagg ataatccaac ctacgtgacc 2280
tcccggggac agtggctgtg cttttaaaaa gagatgcttg caaagcaatg gggaacgtgt 2340
tctcggggca ggtttccggg agcagatgcc aaaaagactt tttcatagag aagaggcttt 2400
cttttgtaaa gacagaataa aaataattgt tatgtttctg tttgttccct ccccctcccc 2460
cttgtgtgat accacatgtg tatagtattt aagtgaaact caagccctca aggcccaact 2520
tctctgtcta tattgtaata tagaatttcg aagagacatt ttcacttttt acacattggg 2580
cacaaagata agctttgatt aaagtagtaa gtaaaaggct acctaggaaa tacttcagtg 2640
aattctaaga aggaaggaag gaagaaagga aggaaagaag ggagggaaac agggagaaag 2700
ggaaaaagaa gaaaaagaga tagatgataa taggaacaaa taaagacaaa caacattaag 2760
gggcatattg taagatttcc atgttaatga tctaatataa tcactcagtg ccacattttg 2820
agaatttttt tttttaatgg gcttcaaaaa ttggaaaact gtgaaagcta agtccattgg 2880
ggggaatgga attacttttg ggggccagta tctttccttt gattgttcc 2929



<210> 19
<211> 1725
<212> DNA
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No.: 3722004CB1
<400> 19

gaggcaagaa ttcggcacga gggagagccc gcgggcgtgg gggagctcgg ggacctgcgg 60
accgggggag cccgaacgag ggggatcccg cggcggcgcc agcgaggcgg aggagcaggc 120
ggtggaggcg aggcaggaag aggagcagga cttggatggt gagaaggggc catcatcgga 180
agggcctgag gaggaggacg gagaaggctt ctccttcaaa tacagccccg ggaagctgag 240
gggaaaccag tacaagaaga tgatgaccaa agaggagctg gaggaggagc agaggattga 300
gctgacctct gacctcactt ccctgtagca agttccttag gtcctgagcc acaaatattc 360
ttgcaaatcc ttttgaactg aagaataacg aagttatcct tagcgtcctc ctaaaggctt 420
ttccttttgg catcttaaaa gcttgagaga taaaacggaa accccagaga ggagtctggg 480
caggctccca gggtgcatgc tgcctccata aatctgctga gctctagacc ctcaatcagg 540
acttgtccct tggctagcag gatcctggga acacctttgg ccctgccctg tgtagagatg 600
ttcatgtctg ttcctgtggg tcactttgtt aagctgaaga gttttaagag gtagagctca 660
gaccctggac tgggattttt cttaccactc aaacttgcta tccacacacc ctgcacacct 720
tagataaaaa gaacatttta aaagcagagt tcactttcac tccagtctcc cctcttttgc 780
cctcactgaa gccaaaccac agaagacttt gaggaatgag agacaaatga ggtagagctc 840
acctgtgctc accagctccg tcagggtggt cagccgaccc ctttccctgg gaaccccact 900
tctctctgtg gctggcttgg ttgtcggggg tgagatgcca tattgattac agggcagcaa 960
agaaccagta ccaggaattt acttgaccat tccccttatt tttcatctag aggaatctcg 1020
gattcagccc tttcattgct aagacacctt ttcactgagg ttcttaccag ctcagccaaa 1080
tctccactct gctatagcag aagcaataat gtttgcttta aaaagatttc ttgacctatg 1140
ccttttctta gaaagtttga tagattagtt agaacttcag atcatcagat cagtctcaaa 1200
tgggtttctt ggaattttat atttgacaat atttatacta taccaaactc atttgcagtt 1260
cttaggtttg ttggttaaaa cattttttta aagcagtaag tttatagaaa atgttttcat 1320
ttaatggaag gctggggaat gtccagcatc aacccctatg gcatgcattc ccagtggcct 1380
tctcatctgg gcctggaacc tttggttcag ggcttagggg agaacaggcc acatggcaac 1440
agccacacag tcattgcctt caacacagag ccacgtgtcc ccaaacagca atagtcatgc 1500
ccttgtccag gctgggatct aattgataca ataggtcgtt gactccctcc tagtagagct 1560
atctaggttt gtctggaaag tttccgaccc tggcttatag gcaccacacc tcatgtactc 1620
ctcatggctt ggatctctgt attcagcctt tgttcagtcc aataaacttt gagtagatga 1680
tctcaaaaaa aaaaaaaaaa aggccggcgc aagcttattc ctttt 1725



<210> 20 

<211> 1987 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No.: 3948614CB1

<400> 20 

gacggccagt gcaagctaaa attaaccctc actaaaggga ataagcttgc ggccgcctgg 60
agctctcggc ctcggcttcg acgacggcaa cttctcgctg ctcatccgcg cggtggagga 120
gacggacgcg gggctgtaca cctgcaacct gcaccatcac tactgccacc tctacgagag 180
cctggccgtc cgcctggagg tcaccgacgg ccccccggcc acccccgcct actgggacgg 240
cgagaaggag gtgctggcgg tggcgcgcgg cgcacccgcg cttctgacct gcgtgaaccg 300
cgggcacgtg tggaccgacc ggcacgtgga ggaggctcaa caggtggtgc actgggaccg 360
gcagccgccc ggggtcccgc acgaccgcgc ggaccgcctg ctggacctct acgcgtcggg 420
cgagcgccgc gcctacgggc ccctttttct gcgcgaccgc gtggctgtgg gcgcggatgc 480
ctttgagcgc ggtgacttct cactgcgtat cgagccgctg gaggtcgccg acgagggcac 540
ctactcctgc cacctgcacc accattactg tggcctgcac gaacgccgcg tcttccacct 600
gacggtcgcc gaaccccacg cggagccgcc cccccggggc tctccgggca acggctccag 660
ccacagcggc gccccaggcc cagaccccac actggcgcgc ggccacaacg tcatcaatgt 720
catcgtcccc gagagccgag cccacttctt ccagcagctg ggctacgtgc tggccacgct 780
gctgctcttc atcctgctac tggtcactgt cctcctggcc gcccgcaggc gccgcggagg 840
ctacgaatac tcggaccaga agtcgggaaa gtcaaagggg aaggatgtta acttggcgga 900
gttcgctgtg gctgcagggg accagatgct ttacaggagt gaggacatcc agctagatta 960
caaaaacaac atcctgaagg agagggcgga gctggcccac agccccctgc ctgccaagta 1020
catcgaccta gacaaagggt tccggaagga gaactgcaaa tagggaggcc ctgggctcct 1080
ggctgggcca gcagctgcac ctctcctgtc tgtgctcctc ggggcatctc ctgatgctcc 1140
ggggctcacc ccccttccag cggctggtcc cgctttcctg gaatttggcc tgggcgtatg 1200
cagaggccgc ctccacaccc ctcccccagg ggcttggtgg cagcatagcc cccacccctg 1260
cggcctttgc tcacgggtgg ccctgcccac ccctggcaca accaaaatcc cactgatgcc 1320
catcatgccc tcagaccctt ctgggctctg cccgctgggg gcctgaagac attcctggag 1380
gacactccca tcagaacctg gcagccccaa aactggggtc agcctcaggg caggagtccc 1440
actcctccag ggctctgctc gtccggggct gggagatgtt cctggaggag gacactccca 1500
tcagaacttg gcagccttga agttggggtc agcctcggca ggagtcccac tcctcctggg 1560
gtgctgcctg ccaccaagag ctcccccacc tgtaccacca tgtgggactc caggcaccat 1620
ctgttctccc cagggacctg ctgacttgaa tgccagccct tgctcctctg tgttgctttg 1680
ggccacctgg ggctgcaccc cctgcccttt ctctgcccca tccctaccct agccttgctc 1740
tcagccacct tgatagtcac tgggctccct gtgacttctg accctgacac ccctcccttg 1800
gactctgcct gggctggagt ctagggctgg ggctacattt ggcttctgta ctggctgagg 1860
acaggggagg gagtgaagtt ggtttggggt ggcctgtgtt gccactctca gcaccccaca 1920
tttgcatctg ctggtggacc tgccaccatc acaataaagt ccccatctga tttttaaaaa 1980
aaaaaaa 1987



<210> 21 
<211> 551 
<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No.: 627722CD1 

<400> 21 



Met Glu Glu Ala Glu Leu Val Lys Gly Arg Leu Gln Ala Ile Thr
  1               5                  10                  15

Asp Lys Arg Lys Ile Gln Glu Glu Ile Ser Gln Lys Arg Leu Lys
                 20                  25                  30

Ile Glu Glu Asp Lys Leu Lys His Gln His Leu Lys Lys Lys Ala
                 35                  40                  45

Leu Arg Glu Lys Trp Leu Leu Asp Gly Ile Ser Ser Gly Lys Glu
                 50                  55                  60

Gln Glu Glu Met Lys Lys Gln Asn Gln Gln Asp Gln His Gln Ile
                 65                  70                  75

Gln Val Leu Glu Gln Ser Ile Leu Arg Leu Glu Lys Glu Ile Gln
                 80                  85                  90

Asp Leu Glu Lys Ala Glu Leu Gln Ile Ser Thr Lys Glu Glu Ala
                 95                 100                 105

Ile Leu Lys Lys Leu Lys Ser Ile Glu Arg Thr Thr Glu Asp Ile
                110                 115                 120

Ile Arg Ser Val Lys Val Glu Arg Glu Glu Arg Ala Glu Glu Ser
                125                 130                 135

Ile Glu Asp Ile Tyr Ala Asn Ile Pro Asp Leu Pro Lys Ser Tyr
                140                 145                 150

Ile Pro Ser Arg Leu Arg Lys Glu Ile Asn Glu Glu Lys Glu Asp
                155                 160                 165

Asp Glu Gln Asn Arg Lys Ala Leu Tyr Ala Met Glu Ile Lys Val
                170                 175                 180

Glu Lys Asp Leu Lys Thr Gly Glu Ser Thr Val Leu Ser Ser Ile
                185                 190                 195

Pro Leu Pro Ser Asp Asp Phe Lys Gly Thr Gly Ile Lys Val Tyr
                200                 205                 210

Asp Asp Gly Gln Lys Ser Val Tyr Ala Val Ser Ser Asn His Ser
                215                 220                 225

Ala Ala Tyr Asn Gly Thr Asp Gly Leu Ala Pro Val Glu Val Glu
                230                 235                 240

Glu Leu Leu Arg Gln Ala Ser Glu Arg Asn Ser Lys Ser Pro Thr
                245                 250                 255
Glu Tyr His Glu Pro Val Tyr Ala Asn Pro Phe Tyr Arg Pro Thr
                260                 265                 270

Thr Pro Gln Arg Glu Thr Val Thr Pro Gly Pro Asn Phe Gln Glu
                275                 280                 285

Arg Ile Lys Ile Lys Thr Asn Gly Leu Gly Ile Gly Val Asn Glu
                290                 295                 300

Ser Ile His Asn Met Gly Asn Gly Leu Ser Glu Glu Arg Gly Asn
                305                 310                 315

Asn Phe Asn His Ile Ser Pro Ile Pro Pro Val Pro His Pro Arg
                320                 325                 330

Ser Val Ile Gln Gln Ala Glu Glu Lys Leu His Thr Pro Gln Lys
                335                 340                 345

Arg Leu Met Thr Pro Trp Glu Glu Ser Asn Val Met Gln Asp Lys
                350                 355                 360

Asp Ala Pro Ser Pro Lys Pro Arg Leu Ser Pro Arg Glu Thr Ile
                365                 370                 375

Phe Gly Lys Ser Glu His Gln Asn Ser Ser Pro Thr Cys Gln Glu
                380                 385                 390

Asp Glu Glu Asp Val Arg Tyr Asn Ile Val His Ser Leu Pro Pro
                395                 400                 405

Asp Ile Asn Asp Thr Glu Pro Val Thr Met Ile Phe Met Gly Tyr
                410                 415                 420

Gln Gln Ala Glu Asp Ser Glu Glu Asp Lys Lys Phe Leu Thr Gly
                425                 430                 435

Tyr Asp Gly Ile Ile His Ala Glu Leu Val Val Ile Asp Asp Glu
                440                 445                 450

Glu Glu Glu Asp Glu Gly Glu Ala Glu Lys Pro Ser Tyr His Pro
                455                 460                 465

Ile Ala Pro His Ser Gln Val Tyr Gln Pro Ala Lys Pro Thr Pro
                470                 475                 480

Leu Pro Arg Lys Arg Ser Glu Ala Ser Pro His Glu Asn Thr Asn
                485                 490                 495

His Lys Ser Pro His Lys Asn Ser Ile Ser Leu Lys Glu Gln Glu
                500                 505                 510

Glu Ser Leu Gly Ser Pro Val His His Ser Pro Phe Asp Ala Gln
                515                 520                 525

Thr Thr Gly Asp Gly Thr Glu Asp Pro Ser Leu Thr Ala Leu Arg
                530                 535                 540

Met Arg Met Ala Lys Leu Gly Lys Lys Val Ile
                545                 550



<210> 22 
<211> 99 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No.: 1556751CD1 

<400> 22 



Met Glu Ala Leu Ala Asn Val Asn Phe Pro Arg Lys Ser Phe Arg
  1               5                  10                  15

Pro Glu Asp Ala Gly Lys Glu Ser Gly Ser Gln Gly Gly Phe Cys
                 20                  25                  30

Val Pro Ala Ala Arg Pro Gln Thr Met Val Thr Gly Pro Ser Cys
                 35                  40                  45

Ser Ser Pro Gly Leu Gln Asn Phe Ser Pro Gln Arg Lys Glu Asn
                 50                  55                  60

Arg Ala Cys Ala Cys Trp Gln Asn Ala Gly Pro Ala Pro Lys Asn
                 65                  70                  75

Pro Met Cys Val Arg Leu Lys Val Gly Arg Pro Gln Ala Ser Gln
                 80                  85                  90

Arg Lys Leu Lys Glu Thr Gly Leu Cys
                 95



<210> 23 
<211> 493 
<212> PRT 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No.: 2268890CD1 



<400> 23 






Met Arg Pro Leu Cys Val Thr Cys Trp Trp Leu Gly Leu Leu Ala
  1               5                  10                  15

Ala Met Gly Ala Val Ala Gly Gln Glu Asp Gly Phe Glu Gly Thr
                 20                  25                  30

Glu Glu Gly Ser Pro Arg Glu Phe Ile Tyr Leu Asn Arg Tyr Lys
                 35                  40                  45

Arg Ala Gly Glu Ser Gln Asp Lys Cys Thr Tyr Thr Phe Ile Val
                 50                  55                  60

Pro Gln Gln Arg Val Thr Gly Ala Ile Cys Val Asn Ser Lys Glu
                 65                  70                  75

Pro Glu Val Leu Leu Glu Asn Arg Val His Lys Gln Glu Leu Glu
                 80                  85                  90

Leu Leu Asn Asn Glu Leu Leu Lys Gln Lys Arg Gln Ile Glu Thr
                 95                 100                 105

Leu Gln Gln Leu Val Glu Val Asp Gly Gly Ile Val Ser Glu Val
                110                 115                 120

Lys Leu Leu Arg Lys Glu Ser Arg Asn Met Asn Ser Arg Val Thr
                125                 130                 135

Gln Leu Tyr Met Gln Leu Leu His Glu Ile Ile Arg Lys Arg Asp
                140                 145                 150

Asn Ala Leu Glu Leu Ser Gln Leu Glu Asn Arg Ile Leu Asn Gln
                155                 160                 165

Thr Ala Asp Met Leu Gln Leu Ala Ser Lys Tyr Lys Asp Leu Glu
                170                 175                 180

His Lys Tyr Gln His Leu Ala Thr Leu Ala His Asn Gln Ser Glu
                185                 190                 195

Ile Ile Ala Gln Leu Glu Glu His Cys Gln Arg Val Pro Ser Ala
                200                 205                 210

Arg Pro Val Pro Gln Pro Pro Pro Ala Ala Pro Pro Arg Val Tyr
                215                 220                 225

Gln Pro Pro Thr Tyr Asn Arg Ile Ile Asn Gln Ile Ser Thr Asn
                230                 235                 240

Glu Ile Gln Ser Asp Gln Asn Leu Lys Val Leu Pro Pro Pro Leu
                245                 250                 255

Pro Thr Met Pro Thr Leu Thr Ser Leu Pro Ser Ser Thr Asp Lys
                260                 265                 270

Pro Ser Gly Pro Trp Arg Asp Cys Leu Gln Ala Leu Glu Asp Gly
                275                 280                 285

His Asp Thr Ser Ser Ile Tyr Leu Val Lys Pro Glu Asn Thr Asn
                290                 295                 300

Arg Leu Met Gln Val Trp Cys Asp Gln Arg His Asp Pro Gly Gly
                305                 310                 315

Trp Thr Val Ile Gln Arg Arg Leu Asp Gly Ser Val Asn Phe Phe
                320                 325                 330

Arg Asn Trp Glu Thr Tyr Lys Gln Gly Phe Gly Asn Ile Asp Gly
                335                 340                 345

Glu Tyr Trp Leu Gly Leu Glu Asn Ile Tyr Trp Leu Thr Asn Gln
                350                 355                 360
Gly Asn Tyr Lys Leu Leu Val Thr Met Glu Asp Trp Ser Gly Arg
                365                 370                 375

Lys Val Phe Ala Glu Tyr Ala Ser Phe Arg Leu Glu Pro Glu Ser
                380                 385                 390

Glu Tyr Tyr Lys Leu Arg Leu Gly Arg Tyr His Gly Asn Ala Gly
                395                 400                 405

Asp Ser Phe Thr Trp His Asn Gly Lys Gln Phe Thr Thr Leu Asp
                410                 415                 420

Arg Asp His Asp Val Tyr Thr Gly Asn Cys Ala His Tyr Gln Lys
                425                 430                 435

Gly Gly Trp Trp Tyr Asn Ala Cys Ala His Ser Asn Leu Asn Gly
                440                 445                 450

Val Trp Tyr Arg Gly Gly His Tyr Arg Ser Arg Tyr Gln Asp Gly
                455                 460                 465

Val Tyr Trp Ala Glu Phe Arg Gly Gly Ser Tyr Ser Leu Lys Lys
                470                 475                 480

Val Val Met Met Ile Arg Pro Asn Pro Asn Thr Phe His
                485                 490
Although claims 10-12 are directed to a method of treatment of the
human/animal body, the search has been carried out and based on the alleged 
effects of the compound/composition. 

Claims Nos.: 2,7 

because they relate to parts of the International Application that do not comply with the prescribed requirements to such
an extent that no meaningful International Search can be carried out, specifically:

See FURTHER INFORMATION sheet PCT/ISA/210 



Claims Nos.: 

because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a). 
Claims Nos.: 2, 7



Claims 2 and 7 are, inter alia, directed to compounds comprising as few
as 18 nucleotides / 6 amino acids, which are defined neither by their
exact structure nor by their exact location in the parent molecule.
Moreover, claim 2 covers

- polynucleotides of completely undefined length that hybridize to said
short sequence (or the parent), and

- sequences which are complementary to the almost undefined sequences
mentioned above.

Said vague structural definitions are not sufficient for a reasonable 
search. 

The applicant's attention is drawn to the fact that claims, or parts of 
claims, relating to inventions in respect of which no international 
search report has been established need not be the subject of an 
international preliminary examination (Rule 66.1(e) PCT). The applicant
is advised that the EPO policy when acting as an International 
Preliminary Examining Authority is normally not to carry out a 
preliminary examination on matter which has not been searched. This is 
the case irrespective of whether or not the claims are amended following 
receipt of the search report or during any Chapter II procedure. 
1. Claims: 1-3,5-7,9,10,12 (all partially)

A polynucleotide having the SEQ ID NO 1; subject-matter of 
claims 2,3,5,6,9,12 relating to said polynucleotide 



2. Claims: 1-12, all partially 

Polypeptide with the SEQ ID NO. 21, encoding
polynucleotides, and related subject-matter 



3. Claims: 1-12, all partially 

Polypeptide with the SEQ ID NO. 22, encoding 
polynucleotides, and related subject-matter 



4. Claims: 1-12, all partially 

Polypeptide with the SEQ ID NO. 23, encoding 
polynucleotides, and related subject-matter 



5. Claims: 1-3,5-7,9,10,12 (all partially) 
INVENTIONS 5-20: 

Polynucleotides having the SEQ ID NOs. 3-5,7-10,12-20; 
subject-matter of claims 2,3,5,6,9,12 relating to said 
polynucleotide. 



6. Claims: 1,3 (partially) 
INVENTION 21: 

Further polynucleotides and encoded polypeptides of claim 1 
(PLEASE NOTE: This subject has been added for completeness
only. It is considered as being at least partially
anticipated (infra); further non-unity objections may arise
upon payment for subject 21)



